(57) Abstract

Polypeptides are disclosed that are useful for diagnosing American trypanosomiasis, or Chagas disease, a disease caused by the
infectious agent Trypanosoma cruzi. The polypeptides have a sequence that corresponds to the amino acid sequence of at least one of the
C-terminal and N-terminal nonrepetitive regions of TCR27 protein. The polypeptide additionally may comprise an amino acid sequence of
one or more repeats from the central region of TCR27 protein. In a preferred embodiment, the polypeptide corresponds to the N-terminal
nonrepetitive region of TCR27 protein and at least one repeat from the central region of TCR27 protein, and does not correspond to the
C-terminal nonrepetitive region. The polypeptides may further comprise a linker sequence at either the N-terminus or the C-terminus to
facilitate attachment or conjugation to a carrier molecule in a liquid or solid support system for use in a sensitive assay for detecting
T. cruzi infection.

POLYPEPTIDES FOR DIAGNOSING INFECTION
WITH TRYPANOSOMA CRUZI

BACKGROUND OF THE INVENTION

The present invention relates to polypeptides that
are useful for diagnosing American trypanosomiasis, or
Chagas disease, a disease caused by the infectious agent
Trypanosoma cruzi. More particularly, the invention
relates to recombinant T. cruzi polypeptides, synthesized
using genetic engineering techniques, and to constructs
and processes for producing the recombinant polypeptides,
and to an assay for detecting T. cruzi infection which
employs the polypeptides.

American trypanosomiasis, or Chagas disease, is an
illness caused by the protozoan parasite T. cruzi (1,2).
This organism is transmitted by insects called reduviid
bugs (3), by blood transfusion (4), and also from mother
to fetus (5). Several years after acquiring T. cruzi
infection, patients may develop the cardiac and
gastrointestinal symptoms that are associated with
chronic infection, which is life-long, but the majority
of infected persons never develop clinical manifestations
of Chagas disease and are unaware of being infected. The
two drugs available for treating T. cruzi infections have
low efficacy and often cause serious side effects. In
practice, therefore, they have virtually no impact on the
control of Chagas disease.

Chagas disease is a major cause of morbidity and
death in Latin America, where an estimated 16-18 million
people are chronically infected with T. cruzi (6). In
recent years tens of thousands of T. cruzi-infected
people have emigrated to the United States, especially
from Central America, where the prevalence of T. cruzi
infection is high, thus creating the risk of transfusion-
associated transmission of the parasite here (7-9).
Several such cases have been described (10-12).

WO 95/25797 PCT/US95/03191

Since clinical criteria cannot be depended upon for
recognizing T. cruzi infection, blood tests are of
paramount importance, both in patient care settings and
in blood banks. Chronically infected persons uniformly
have anti-T. cruzi antibodies. The diagnosis of T. cruzi
infection is almost always made by detecting these
antibodies in patients' blood, since parasitological
approaches are laborious and lack sensitivity and, as
noted, clinical evaluations lack specificity.

Immunological tests currently used to diagnose
T. cruzi infection, such as complement fixation and
indirect immunofluorescence tests, and enzyme-linked
immunosorbent assays (ELISA), often produce inconsistent
results and false-positive reactions (13). The
occurrence of false-positive reactions can be a problem
with specimens from patients with leishmaniasis,
schistosomiasis, and other parasitic and infectious
diseases, with samples from patients with autoimmune
disorders and other illnesses, and with specimens from
normal persons.

In large measure these problems with sensitivity and
specificity occur because the assays are based on
antigens extracted from parasites grown in the
laboratory. The complexity and variability of mixtures
of native antigens derived from cultured parasites, which
persist even after fractionation and purification
procedures are used, have been a major barrier to
standardization of immunoassays. Because of the
limitations of these immunoassays, experts generally
agree that blood samples should be positive in three
different assays, performed in parallel, before being
accepted as positive.

An additional problem related to assays based on
material derived from cultured parasites is that
producing the antigens creates a serious biohazard for
technical personnel, and laboratory-acquired cases of
Chagas disease occur with disquieting frequency, both in
the United States and abroad (14,15). Furthermore, some
of the immunoassays currently available require
sophisticated laboratory equipment and levels of
technical expertise not generally available in the
countries in which T. cruzi infection is endemic.

In response to the need for improved assays for
detecting T. cruzi infection, considerable work has been
invested in the development of new immunoassays. These
efforts have accelerated in recent years as new
technologies have become available that have the
potential for serving as the basis of improved assays.
Recombinant DNA technology has led to the molecular
cloning of several antigenic T. cruzi proteins. Cloned
segments of T. cruzi genes have been used to produce in
bacteria portions of antigenic proteins (16-22). In
research settings several of these, singly and in
combination, have been used as target antigens in
immunoassays. These assays have not been tested in field
or blood bank trials, and none is available commercially.

United States Patent No. 4,870,006 discloses the use
of a recombinant protein in an assay for diagnosing
T. cruzi infection. A 70-kilodalton heat shock protein
constitutes the target antigen in this assay. No
information regarding the sensitivity and specificity of
the assay is provided in the patent.

In this context, therefore, a need exists for a
highly sensitive and specific system for detecting
T. cruzi infection that is safe, easy, and inexpensive to
manufacture and perform.

SUMMARY OF THE INVENTION

It is therefore an object of the present invention
to provide a highly sensitive and specific assay for
diagnosing infection with T. cruzi.

It is a further object of the present invention to
provide an assay for diagnosing T. cruzi infection that
is safe, inexpensive to manufacture and easy to use.

In achieving these and other objects, there has been
provided, according to one aspect of the present
invention, a polypeptide having a sequence that
corresponds to the amino acid sequence of at least one of
the C-terminal and N-terminal nonrepetitive regions of
the TCR27 protein. The inventive polypeptide
additionally may comprise an amino acid sequence of one
or more repeats from the central region of the TCR27
protein. In a preferred embodiment, the polypeptide
corresponds to the N-terminal nonrepetitive region of the
TCR27 protein and at least one repeat from the central
region of the TCR27 protein, and does not correspond to
the C-terminal nonrepetitive region. The polypeptides
may further comprise a linker sequence at either the
N-terminus or the C-terminus to facilitate attachment or
conjugation to a carrier molecule in a liquid or solid
support system. Isolated polynucleotides that encode the
inventive polypeptides according to the present invention
are also claimed, as are cells transformed with a
recombinant plasmid that expresses a polypeptide
according to the invention.

The present invention also provides a method for
detecting the presence of antibodies to T. cruzi in an
individual, comprising the steps of contacting a putative
anti-T. cruzi antibody-containing sample from an
individual with a polypeptide according to the invention
that is attached or conjugated to a carrier molecule or
attached or conjugated to a solid phase; allowing
anti-T. cruzi antibodies in said sample to bind to said
polypeptide; washing away unbound anti-T. cruzi
antibodies; and adding a compound that enables detection
of the anti-T. cruzi antibodies which are specifically
bound to the polypeptide. The compound that enables
detection of the anti-T. cruzi antibodies may be selected
from the group consisting of a colorimetric agent, a
fluorescent agent, a chemiluminescent agent and a
radionuclide.

Also provided in accordance with the present
invention is a kit for diagnosing the presence of
anti-T. cruzi antibodies in a sample, comprising a container in
which a polypeptide having a sequence that corresponds to
the amino acid sequence of at least one of the C-terminal
and N-terminal nonrepetitive regions of the TCR27 protein
is attached or conjugated to a carrier molecule or
attached or conjugated to a solid phase; and directions
for carrying out the method according to the invention.
The kit additionally may comprise a container of a
compound that binds to anti-T. cruzi antibodies and that
renders said antibodies detectable.

Other objects, features and advantages of the present 
invention will become apparent from the following 
detailed description. It should be understood, however, 
that the detailed description and the specific examples, 
while indicating preferred embodiments of the invention,
are given by way of illustration only, since various 
changes and modifications within the spirit and scope of 
the invention will become apparent to those skilled in 
the art from this detailed description. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a schematic diagram of the T. cruzi TCR27 
gene and the segments of the gene that encode 
polypeptides according to the present invention. 

Figures 2A through 2E show the nucleotide and deduced 
amino acid sequences (SEQ ID NOS 1-10 respectively) of
polypeptides according to the present invention. 

Figures 3A through 3F are bar graphs of results 
obtained when recombinant TCR27 polypeptides are used as 
target antigens in ELISAs to test blood samples (serum or
plasma) for anti-T. cruzi antibodies.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

It has been discovered that a T. cruzi gene
designated "TCR27" (23) encodes an immunodominant protein
containing unique, nonrepetitive regions at both the
C-terminus and N-terminus, in addition to a central
region comprised of repeats of a 14-amino acid sequence.
It has been further discovered that there are two copies
of the TCR27 gene that essentially differ only in the
number of repeats that comprise the central region. It
also has been discovered that the nonrepetitive terminal
regions of the TCR27 protein contain epitopes to which
individuals infected with T. cruzi typically have
antibodies. The existence of these epitopes within the
nonrepetitive regions was not suggested previously.

More particularly, the native protein encoded by the
TCR27 gene consists of an N-terminal 95-amino acid
sequence and a C-terminal 68-amino acid sequence. A
central region of repeats encodes 69 repeats of a
highly-conserved, 14-amino acid sequence. In accordance with
the present invention, a polypeptide that corresponds to
at least one of the C-terminal or N-terminal
nonrepetitive regions can form the basis for a sensitive
assay to diagnose T. cruzi infection.

In one preferred embodiment, such a polypeptide
corresponds to at least one of the C-terminal or
N-terminal nonrepetitive regions in combination with a
region of one or more repeats from the central region of
the TCR27 protein. In a particularly preferred
embodiment, a polypeptide for use in an assay according
to the present invention contains the N-terminal
nonrepetitive region in combination with one or more
repeats from the central region of the TCR27 protein, but
does not contain a region corresponding to the C-terminal
nonrepetitive region. Polypeptides according to the
present invention that include repeat regions in addition
to one of the nonrepetitive regions will contain at least
one, and preferably at least two, copies of the 14-amino
acid repeat.

In addition to the nonrepetitive and repeat regions
per se, a wide variety of polypeptides which contain the
epitopes embodied in these regions can be used in
accordance with the present invention. Based on the
nucleotide sequences in Figures 2A through 2E (SEQ ID NOS
1, 3, 5, 7 and 9 respectively), polypeptide molecules
also can be produced (1) that include sequence
variations, relative to the naturally-occurring
sequences, (2) that have one or more amino acids
truncated from the naturally-occurring sequences and
variations thereof, or (3) that contain the
naturally-occurring sequences and variations thereof as part of a
longer sequence.

In this description, polypeptide molecules in
categories (1), (2) and (3) are said to "correspond" to
the amino acid sequences of the nonrepetitive or repeat
regions of the TCR27 protein. Such polypeptides also are
referred to as "variants." The category of variants
within the present invention includes, for example,
fragments and muteins of the nonrepetitive and repeat
regions, as well as larger molecules that consist
essentially of one or both of the nonrepetitive
sequences, alone or in combination with one or more
repeats from the central region.

In this regard, a molecule that "consists essentially
of" one or both of the nonrepetitive sequences, alone or
in combination with one or more repeats from the central
region, is one that reacts immunologically with samples
from persons infected with T. cruzi, but that does not
react with samples from patients with leishmaniasis,
schistosomiasis, and other parasitic and infectious
diseases, with samples from patients with autoimmune
disorders and other illnesses, and with specimens from
normal persons.

A "mutein" is a polypeptide that is homologous to the
nonrepetitive or repeat region to which it corresponds,
and that retains the basic functional attribute of the
corresponding region, namely the ability to react
selectively with samples from persons infected with
T. cruzi.
For purposes of this description, "homology" between two
sequences connotes a likeness short of identity
indicative of a derivation of the first sequence from the
second. In particular, a polypeptide is "homologous" to
the corresponding nonrepetitive or repeat region if a
comparison of amino-acid sequences between the
polypeptide and the corresponding region reveals an
identity of greater than 70%. Such a sequence comparison
can be performed via known algorithms, such as the one
described by Lipman and Pearson (24), which are readily
implemented by computer. Polypeptides derived from other
strains and clones of T. cruzi that are homologous to the
sequences shown in Figures 2A through 2E constitute
naturally-occurring muteins and are within the scope of
the present invention.
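
The 70% identity criterion can be illustrated with a short calculation. The following is a minimal sketch only, and it is not the Lipman and Pearson algorithm cited above: as a simplifying assumption it substitutes a plain Needleman-Wunsch global alignment with unit scores and reports the percentage of aligned positions (gaps included in the denominator) that are identical. The function names, the scoring values and the toy peptide strings are hypothetical and are not taken from the TCR27 sequences.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned residues as a percentage of the alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def is_homologous(candidate, reference, threshold=70.0):
    """Apply the 'identity of greater than 70%' test from the description."""
    return percent_identity(candidate, reference) > threshold


# Hypothetical toy peptides (one-letter codes), not TCR27 sequences.
reference = "ACDEFGHIKL"
one_substitution = "ACDEWGHIKL"   # 9 of 10 positions identical
four_substitutions = "ACDEFGWWWW"  # 6 of 10 positions identical
print(is_homologous(one_substitution, reference))
print(is_homologous(four_substitutions, reference))
```

Under these assumed scores the first toy peptide passes the threshold and the second does not. A production comparison would normally use a substitution matrix and an established alignment tool rather than unit scores.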

A fragment of a nonrepetitive or repeat region is a 
molecule in which one or more amino acids are truncated
from that nonrepetitive or repeat region. Muteins and
fragments can be produced, in accordance with the present 
invention, by known de novo synthesis techniques. 

Also exemplary of variants within the present
invention are molecules that are longer than a
nonrepetitive or a repeat region but that contain the
region or a mutein thereof within the longer sequence.
For example, a variant may include a fusion partner in
addition to the nonrepetitive or repeat region. Such a
fusion partner may allow easier purification of
recombinantly-produced polypeptides. For example, use of
a glutathione-S-transferase (26 kilodaltons, GST) fusion
partner allows purification of recombinant polypeptides
on glutathione agarose beads.

The portion of the sequence of such a molecule other
than that portion of the sequence corresponding to the
region may or may not be homologous to the sequence of
the TCR27 protein. If it is homologous with the TCR27
protein, it is not coincident with the sequence of the
TCR27 protein.

It will be appreciated that polypeptides shorter than
the corresponding nonrepetitive region but that retain
the ability to react selectively with samples from
persons infected with T. cruzi are suitable for use in
the present invention. Thus, variants may be of the same
length, longer than or shorter than the nonrepetitive or
repeat regions, and also include sequences in which there
are amino acid substitutions of the parent sequence.
These variants must retain the ability to react
selectively with samples from persons infected with
T. cruzi.

Whether a polypeptide based on one of the sequences
shown in Figures 2A through 2E (SEQ ID NOS 1-10
respectively) retains the ability to react selectively
with samples from persons infected with T. cruzi can be
determined routinely in accordance with the protocols set
forth herein, that is, by reacting it with serologically
well-characterized specimens from patients known to be
infected with T. cruzi, and with similarly serologically
well-characterized specimens from patients known to be
affected with those conditions that typically cause false
positive reactions in assays for antibodies to T. cruzi,
such as leishmaniasis, schistosomiasis, and other
parasitic and infectious diseases, with samples from
patients with autoimmune disorders and other illnesses,
and with specimens from normal persons.

A schematic diagram of the TCR27 gene is shown in
Figure 1. The horizontal rectangle depicts the protein
encoding region of the TCR27 gene, which contains a
central segment consisting of approximately 69 highly
conserved repeats, each 42 nucleotides in length, flanked
on both sides by dissimilar, nonrepetitive sequences.
Restriction sites are indicated by A (AvaII), P (PvuII),
and H (HincII). The positions of the segments of the
TCR27 gene that encode polypeptides which are
representative of the present invention are indicated by
the solid horizontal bars. Thus, polypeptide Ag2-2 is
encoded by the nonrepetitive, upstream DNA segment of the
TCR27 gene, polypeptide Ag15 by that nonrepetitive
segment plus 16 of the 42-nucleotide repeat units,
polypeptide Ag8 by a segment consisting of 15 of the
42-nucleotide repeat units, and polypeptide Ag4 by the
nonrepetitive, downstream segment of the TCR27 gene.
Also, the coding region for polypeptide Ag44 consists of
the nonrepetitive, upstream coding region of the TCR27
gene, followed by a segment containing 16 repeats,
followed by the nonrepetitive, downstream coding region
of the TCR27 gene. The dashed double arrow indicates
that the two depicted segments of Ag44 are combined in
one continuous coding sequence.
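
The region sizes quoted in this description can be cross-checked with simple arithmetic. The sketch below is illustrative only: the class and function names are hypothetical, the figures (95, 68, 14 and approximately 69) are those stated in the text, and the computed total is an estimate because the repeat count is approximate and the two gene copies are said to differ in their number of repeats.

```python
from dataclasses import dataclass

CODON_LENGTH = 3  # nucleotides per encoded amino acid


@dataclass(frozen=True)
class Tcr27Layout:
    """Hypothetical container for the region sizes stated in the description."""
    n_terminal_aa: int = 95   # N-terminal nonrepetitive region
    c_terminal_aa: int = 68   # C-terminal nonrepetitive region
    repeat_aa: int = 14       # length of one repeat unit
    repeat_count: int = 69    # "approximately 69" repeats

    @property
    def repeat_nt(self):
        """Nucleotides per repeat unit (14 codons)."""
        return self.repeat_aa * CODON_LENGTH

    @property
    def central_region_aa(self):
        """Amino acids in the central repeat region."""
        return self.repeat_aa * self.repeat_count

    @property
    def total_aa(self):
        """Estimated length of the full protein."""
        return self.n_terminal_aa + self.central_region_aa + self.c_terminal_aa


def repeat_segment_nt(layout, repeats):
    """Nucleotides spanned by a given number of whole repeat units."""
    return repeats * layout.repeat_nt


layout = Tcr27Layout()
print(layout.repeat_nt)               # 42, matching the 42-nucleotide repeat
print(layout.total_aa)                # estimate of the full-length protein
print(repeat_segment_nt(layout, 16))  # repeat portion described for Ag15 and Ag44
```

The 14-amino acid repeat and the 42-nucleotide repeat unit agree with each other (14 x 3 = 42), which is the only consistency the sketch actually verifies.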

Figure 2A through Figure 2E show the nucleotide and
deduced amino acid sequences (SEQ ID NOS 1-10
respectively) for Ag15, Ag2-2, Ag4, Ag44 and Ag8,
respectively. The DNA letter codes are: A, adenine;
C, cytosine; G, guanine; and T, thymine. The amino acid
codes are: A, alanine; C, cysteine; D, aspartic acid;
E, glutamic acid; F, phenylalanine; G, glycine;
H, histidine; I, isoleucine; K, lysine; L, leucine;
M, methionine; N, asparagine; P, proline; Q, glutamine;
R, arginine; S, serine; T, threonine; V, valine;
W, tryptophan; Y, tyrosine. Stop codons are indicated by
a single asterisk.
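
The relationship between the DNA letter codes and the one-letter amino acid codes listed above can be sketched as a translation routine. This is an illustration under the assumption that the standard genetic code applies; the function and variable names are hypothetical, and the short input string is a toy example rather than a sequence from Figures 2A through 2E. Stop codons are written as an asterisk, following the convention stated above.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Toy input: start codon, one alanine codon, one stop codon.
print(translate("ATGGCCTAA"))
```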

The five TCR27 gene segments that encode recombinant
polypeptides according to the invention are inserted into
plasmid pGEX (25). The gene encoding GST is positioned
upstream from the SmaI site into which the TCR27 segments
are inserted, and thus the recombinant polypeptides
encoded by these plasmids have GST attached to their
N-termini. The presence of GST allows purification of the
recombinant polypeptides on glutathione agarose beads, as
described below, but it will be readily apparent to those
of ordinary skill in the art that the GST fusion partner
can be cleaved from polypeptides to be used in an assay
according to the invention.

Figure 2A shows DNA and deduced amino acid sequences
(SEQ ID NOS 1 and 2 respectively) of Ag15, which is a
GST-TCR27 polypeptide-pGEX-2T polylinker fusion protein.
GST is encoded by nucleotides 1 through 681, which are
derived from pGEX-2T. The segment of the T. cruzi TCR27
protein that constitutes part of Ag15 is encoded by
nucleotides 682 through 1671. The seven-amino acid
sequence that constitutes the C-terminus of Ag15 is
encoded by nucleotides 1672 through 1695, which is the
pGEX-2T polylinker remnant that lies downstream from the
SmaI site.

Figure 2B shows DNA and deduced amino acid sequences
(SEQ ID NOS 3 and 4 respectively) of Ag2-2, which is a
GST-TCR27 polypeptide-pGEX-2T polylinker fusion protein.
GST is encoded by nucleotides 1 through 681, which are
derived from pGEX-2T. The segment of the T. cruzi TCR27
protein that constitutes part of Ag2-2 is encoded by
nucleotides 682 through 1041. The seven-amino acid
sequence that constitutes the C-terminus of Ag2-2 is
encoded by nucleotides 1042 through 1065, which is the
pGEX-2T polylinker remnant that lies downstream from the
SmaI site.

Figure 2C shows DNA and deduced amino acid sequences
(SEQ ID NOS 5 and 6 respectively) of Ag4, which is a
GST-TCR27 polypeptide fusion protein. GST is encoded by
nucleotides 1 through 663, which are derived from pGEX-1.
The segment of the T. cruzi TCR27 protein that
constitutes part of Ag4 is encoded by nucleotides 664
through 924.

Figure 2D shows DNA and deduced amino acid sequences
(SEQ ID NOS 7 and 8 respectively) of Ag44, which is a
GST-TCR27 polypeptide fusion protein. GST is encoded by
nucleotides 1 through 681, which are derived from
pGEX-2T. The segment of the T. cruzi TCR27 protein that
constitutes part of Ag44 is encoded by nucleotides 682
through 1932.

Figure 2E shows DNA and deduced amino acid sequences
(SEQ ID NOS 9 and 10 respectively) of Ag8, which is a
fusion protein consisting of the following polypeptides:
(1) GST is encoded by nucleotides 1 through 678, which
are derived from pGEX-3X; (2) a six-amino acid sequence
is encoded by nucleotides 679 through 696, which are
derived from the portion of the polylinker region of
pBluescript (26) that lies between the BamHI and EcoRI
sites; (3) the segment of the T. cruzi TCR27 protein that
constitutes part of Ag8 is encoded by nucleotides 697
through 1374; (4) a seven-amino acid sequence is encoded
by nucleotides 1375 through 1395, which are derived from
the portion of the polylinker region of pBluescript that
lies between the EcoRV and HincII sites; and (5) a
seven-amino acid sequence that constitutes the C-terminus
of Ag8 is encoded by nucleotides 1396 through 1419, which
is the pGEX-3X polylinker remnant that lies downstream
from the HincII site.
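
The nucleotide coordinates given for Figures 2A through 2E can be checked for internal consistency with a few lines of code. The sketch below is a hedged illustration: the dictionary layout and function names are hypothetical, and the coordinates are copied from the description above. It confirms only that the segments of each construct are contiguous and that each spans a whole number of codons. One interpretive assumption is labeled in the code: a 24-nucleotide C-terminal segment that is described as encoding seven amino acids is taken to include a final stop codon.

```python
# (label, first nucleotide, last nucleotide), as stated in the description.
CONSTRUCTS = {
    "Ag15": [("GST", 1, 681), ("TCR27", 682, 1671),
             ("pGEX-2T polylinker", 1672, 1695)],
    "Ag2-2": [("GST", 1, 681), ("TCR27", 682, 1041),
              ("pGEX-2T polylinker", 1042, 1065)],
    "Ag4": [("GST", 1, 663), ("TCR27", 664, 924)],
    "Ag44": [("GST", 1, 681), ("TCR27", 682, 1932)],
    "Ag8": [("GST", 1, 678), ("pBluescript polylinker", 679, 696),
            ("TCR27", 697, 1374), ("pBluescript polylinker", 1375, 1395),
            ("pGEX-3X polylinker", 1396, 1419)],
}


def segment_length(first, last):
    """Length in nucleotides of an inclusive coordinate range."""
    return last - first + 1


def summarize(segments):
    """Return (label, nucleotides, codons) rows after checking consistency."""
    rows = []
    expected_first = 1
    for label, first, last in segments:
        if first != expected_first:
            raise ValueError(f"{label} does not start at {expected_first}")
        length = segment_length(first, last)
        if length % 3 != 0:
            raise ValueError(f"{label} is not a whole number of codons")
        rows.append((label, length, length // 3))
        expected_first = last + 1
    return rows


for name, segments in CONSTRUCTS.items():
    for label, nucleotides, codons in summarize(segments):
        # Assumption: where the text gives 7 amino acids for a 24-nt
        # segment, the eighth codon is presumed to be the stop codon.
        print(f"{name:6s} {label:24s} {nucleotides:5d} nt {codons:4d} codons")
```

All five constructs pass both checks with the coordinates as printed, which suggests the ranges were transcribed consistently; it says nothing about the sequences themselves.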

The presence of GST in these five fusion polypeptides 
allows purification of the recombinant polypeptides on 
glutathione agarose beads, as described below, but it 
will be readily apparent to those of ordinary skill in
the art that the GST fusion partner can be cleaved from 
polypeptides to be used in an assay according to the 
invention. 

Polypeptides useful in an assay according to the
invention can be synthetic peptides made by chemical
synthesis techniques, but preferably are produced by
recombinant techniques. DNA encoding the polypeptides
preferably is obtained by cloning and recombination of
DNA segments of the TCR27 gene. These DNA segments are
utilized to produce recombinant polypeptides in bacteria.
The N-termini or the C-termini of these polypeptides can
be modified, respectively, to include a linker sequence
that facilitates attachment or conjugation of the
portions of the polypeptides that constitute the reactive
epitopes to carrier molecules in solution or to solid
support systems. In addition, the DNA sequences that
encode the recombinant polypeptides may be modified such
that the amino acid sequences described herein are not
altered, or they may be altered such that the
polypeptides are shortened or lengthened, or have amino
acid substitutions that are preferably conservative.

The present invention further relates to methods for
diagnosing T. cruzi infection by detecting antibodies
that bind specifically to epitopes contained in the
inventive polypeptides. The method consists of bringing
into contact a sample of whole blood, or an
antibody-containing component of blood, with a polypeptide,
according to the invention, that is attached or
conjugated to a carrier molecule or solid phase. After
a period of contact between the sample and the
polypeptide, during which antibodies in the sample are
bound to the polypeptide, unbound antibodies are washed
away. The bound antibodies are then visualized or
otherwise detected by adding a compound or compounds that
detect the antibodies which are specifically bound to the
polypeptides. Exemplary of compounds that enable
detection of the anti-T. cruzi antibodies are
colorimetric agents, fluorescent agents, chemiluminescent
agents and radionuclides.

A significant feature of the present invention is
that it enables the use of a well-defined T. cruzi
antigen, to which a large number of infected individuals
produce antibodies, in a method of diagnosing T. cruzi
infection. In accordance with the present invention,
preparations formulated from polypeptides which are
produced recombinantly or by chemical synthesis,
respectively, are "substantially pure." That is, they do
not contain other proteins or polypeptides of T. cruzi
origin, in contrast to antigenic preparations derived
from cultured parasites. Such crude preparations are
complex and variable in constituency, and typically
contain a variety of T. cruzi antigens even after
fractionation and purification procedures are used. Some
of these other antigens are cross-reactive with other
antibodies produced in response to other parasitic and
infectious diseases, and to some noninfectious diseases
as well, giving rise to false positives. This has been
a major barrier to standardization of immunoassays for
diagnosis of T. cruzi.

A high percentage of blood specimens from
T. cruzi-infected persons from six different Latin American
countries had easily demonstrable specific antibodies to
polypeptides according to the invention, whereas
specimens from normal persons did not. Equally
important, specimens from patients with diseases that are
often associated with false-positive reactions, such as
leishmaniasis, schistosomiasis, and other parasitic and
infectious diseases, as well as autoimmune disorders, did
not produce false positives in assays with polypeptides
according to the present invention. Thus, the present
polypeptides are useful for diagnosing infection with
T. cruzi.

Results of assays with various polypeptides are shown
in Figures 3A through 3F. Two panels of specimens were
used. The first panel consisted of twelve serologically
well-characterized specimens from T. cruzi-infected
patients from six Latin American countries, and twelve
control specimens from healthy persons, half from Latin
America and half from the United States. The second
panel of specimens consisted of twelve serologically
well-characterized specimens from T. cruzi-infected
patients from five Latin American countries, and 44
control specimens from patients with the following
conditions (# of patients):

visceral leishmaniasis (8)
cutaneous leishmaniasis (8)
autoimmune disease (6)
schistosomiasis (4)
toxoplasmosis (2)
pneumocystosis (2)
syphilis (1)
and healthy persons (13).

The T. cruzi-infected patients in the two panels were not
selected because of high or low antibody titers, as
determined in conventional immunoassays, and the two
groups of twelve T. cruzi-infected patients did not
overlap.

Figure 3A presents results obtained when Ag15 was
reacted with specimens in Panel 2 in an ELISA. The
vertical bars indicate mean absorbance values for the
T. cruzi-infected and uninfected groups. Standard
deviations are indicated by the lines projecting from the
bars. The ratio of the mean absorbance value of the
T. cruzi-infected patients to that of the controls was
4:1, suggesting that Ag15 can serve as the basis for
sensitive and specific assays for detecting T. cruzi
infection.

Results obtained when Ag2-2 was reacted with
specimens in Panel 1 in an ELISA are shown in Figure 3B.
The ratio of the mean absorbance value of the
T. cruzi-infected patients to that of the controls was 1.5:1.
While this was considerably less than the ratio of
absorbance values obtained with Ag15, the results do
indicate clearly that many T. cruzi-infected patients
have antibodies that bind specifically to epitopes
present on the nonrepetitive, upstream portion of the
TCR27 protein and that Ag2-2 can be used in an assay for
detecting T. cruzi infection.

Figure 3C shows results obtained when Ag4 was reacted
with specimens in Panel 1 in an ELISA. The ratio of the
mean absorbance value of the T. cruzi-infected patients
to that of the controls was 1.7:1. This ratio of
absorbance values again was considerably less than the
ratio obtained with Ag15, but as was the case with Ag2-2
the results indicate clearly that many T. cruzi-infected
patients have antibodies that bind specifically to
epitopes present on the nonrepetitive, downstream portion
of the TCR27 protein and that an assay for detecting
T. cruzi infection can be based on Ag4.

Results obtained when Ag44 was reacted with specimens
in Panel 2 in an ELISA are presented in Figure 3D. The
ratio of the mean absorbance value of the
T. cruzi-infected patients to that of the uninfected persons was
2:1, suggesting that Ag44 can serve as the basis for
sensitive and specific assays for detecting T. cruzi
infection.

Figure 3E displays results obtained when Ag8 was
reacted with specimens in Panel 2 in an ELISA. The ratio
of the mean absorbance value of the T. cruzi-infected
patients to that of the controls was 1.5:1. This is less
than the ratios obtained with Ag15 and Ag44, thus
suggesting that assays based on the latter antigens will
be more discriminative than assays based on Ag8.

Results obtained when GST alone was reacted with
specimens in Panel 2 in an ELISA are displayed in Figure
3F. The ratio of the mean absorbance value of the
T. cruzi-infected patients to that of the controls was
1:1, indicating unambiguously that the ability of the
assays based on the recombinant TCR27 proteins to
discriminate between specimens from T. cruzi-infected
patients and those of controls is based on antibody
binding to the T. cruzi portions of the fusion proteins,
rather than on reactivity with GST.
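
The ratios quoted above compare the mean absorbance of the infected group with that of the control group. A minimal sketch of that calculation follows. The function name and the sample readings are hypothetical and were invented purely for illustration; they are not data from Figures 3A-3F.

```python
from statistics import mean


def absorbance_ratio(infected, controls):
    """Return mean(infected) / mean(controls) for two lists of absorbance readings."""
    if not infected or not controls:
        raise ValueError("both groups need at least one reading")
    control_mean = mean(controls)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return mean(infected) / control_mean


# Hypothetical absorbance readings (illustration only, not data from the figures).
infected_example = [0.30, 0.40, 0.50]
control_example = [0.15, 0.20, 0.25]

ratio = absorbance_ratio(infected_example, control_example)
print(f"{ratio:.1f}:1")
```

A ratio near 1:1, as reported for GST alone, would indicate no discrimination between the two groups; larger ratios indicate better separation of the group means, although the ratio by itself says nothing about the overlap of individual specimens.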

The present invention can be understood further with
reference to the following, non-limiting examples.

Example 1. Propagation and Isolation of Parasites

Epimastigotes of the Sylvio X-10/4 clone of T. cruzi
(27) were maintained in logarithmic growth phase at 26°C
in supplemented liver digest neutralized medium and
harvested as described earlier (28). Mixtures of
epimastigotes and culture-derived metacyclic
trypomastigotes (CMT) (~1:1) were produced in supplemented
Grace's insect medium, and purified CMT (>90%) were
obtained by passing the mixture through a DE52 column.

Example 2. Construction of cDNA Expression Library

RNA was isolated from purified Sylvio X-10/4 CMT as
described (29) and cDNAs were synthesized from total RNA,
without prior isolation of poly(A)+ RNA, with Moloney
murine leukemia virus reverse transcriptase in the BRL
Synthesis System (Bethesda Research Laboratories,
Gaithersburg, MD). After treatment of the cDNAs with
EcoRI methylase, EcoRI linkers were attached and the
cDNAs were ligated into bacteriophage ZAP (Stratagene,
San Diego, CA). After packaging of the recombinant phage
with GigaPack Gold (Stratagene), a library of 6.4 x 10*
independent clones was obtained, and 5 x 10* clones were
amplified in E. coli Y1090.

Example 3. Immunoscreening the cDNA Library and 
Isolation of a TCR27 cDNA

Serum from a Bolivian patient with clinically
apparent Chagas disease, whose infection with T. cruzi
had been established parasitologically and by
conventional serologic assays (30), was used for
immunoscreening. The amplified cDNA library was
immunoscreened as described previously (31) using
horseradish peroxidase-conjugated goat
anti-immunoglobulin G as secondary antibody. Approximately 30
strongly reactive phage were identified, and recombinant
pBluescript plasmids were recovered from purified
reactive ZAP clones by coinfecting E. coli XL1-Blue with
the recombinant phage and R408 helper phage (26).
Nucleotide sequences of cloned cDNAs were determined
using the Sequenase kit (U.S. Biochemicals, Cleveland,
OH).

One of the cDNAs isolated by this approach,
designated "TCR27," is 1,660 nucleotides in length and
has a 1,230 nucleotide single open reading frame as well
as a poly A tail. The upstream segment of this cDNA
encodes 25 highly conserved 14-amino acid repeats, and
the portion of the coding region downstream from this
repetitive region encodes a dissimilar and nonrepetitive
68-amino acid sequence (17).

Example 4. Construction of the Genomic Library and
Isolation of a Full-Length TCR27 Gene

Genomic DNA was isolated from 6 x 10' Sylvio X-10/4
epimastigotes as described (32). A genomic library was
constructed in bacteriophage FIX using the procedures
suggested by the supplier of the vector (Stratagene).
Approximately 100,000 phage plaques were screened by
hybridizing radiolabeled TCR27 cDNA to phage DNA bound to
nitrocellulose filters using standard procedures (33).
Six recombinant phage bearing inserts containing at least
a segment of a TCR27 gene were identified, and one, which
was approximately 9.5 kilobases in length, was
characterized in detail after cloning into plasmid
pBluescript.

DNA of the pBluescript clone bearing the 9.5 kilobase
TCR27 fragment was prepared as described (33) and
analyzed by digestion with various restriction
endonucleases, electrophoresis in 1% agarose gels, and
visualization under UV illumination. Information
obtained through restriction mapping and DNA sequencing,
performed using the Sequenase kit (U.S. Biochemicals) and
on an automated DNA sequencer (ABI, Foster City, CA), was
used to construct the schematic diagram of the TCR27 gene
shown in Figure 1. The salient features of the TCR27
gene include a ~2.9 kilobase central region that encodes
69 of the highly conserved 14-amino acid repeats. This
central region is flanked upstream and downstream by
dissimilar and nonrepetitive regions that encode 95- and
68-amino acid sequences respectively.
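
The figures given for the gene are mutually consistent, as the short arithmetic sketch below illustrates. The repeat length, repeat count, and flanking-region lengths are taken from the text. The total polypeptide length is an assumption: it supposes that the three regions together make up the entire coding sequence, which the text does not state explicitly.

```python
# Values stated in the description of the TCR27 gene.
REPEAT_AA = 14        # amino acids per repeat
REPEAT_COUNT = 69     # repeats in the central region
UPSTREAM_AA = 95      # upstream nonrepetitive region
DOWNSTREAM_AA = 68    # downstream nonrepetitive region
NT_PER_CODON = 3

repeat_nt = REPEAT_AA * NT_PER_CODON          # length of one repeat in nucleotides
central_nt = REPEAT_COUNT * repeat_nt         # length of the central repetitive region

# Assumption: the coding region consists only of the three regions named above.
total_aa = UPSTREAM_AA + REPEAT_COUNT * REPEAT_AA + DOWNSTREAM_AA

print(f"repeat unit: {repeat_nt} nt")
print(f"central region: {central_nt} nt (~{central_nt / 1000:.1f} kb)")
print(f"assumed total: {total_aa} amino acids")
```

The 42-nucleotide repeat unit and the roughly 2.9 kilobase central region computed here match the values used elsewhere in the examples.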

Example 5. Construction of Recombinant Plasmids
Containing Segments of the TCR27 Gene

Plasmid encoding Ag15.

Recombinant pBluescript DNA bearing the TCR27 gene
was digested with AvaII and HincII and the resulting 3.8
kilobase fragment, after isolation by electrophoresis and
filling in the AvaII end, was cloned into the SmaI site
of pGEX-2T (Pharmacia Biotech, Piscataway, NJ) (25) using
standard procedures (33). After production of DNA of the
latter recombinant plasmid, designated pTCR27-7, a
BamHI-EcoRI fragment was isolated and was subjected to partial
digestion with PvuII, which cuts in the 42-nucleotide
TCR27 repeat sequence. The resulting mixture of DNA
fragments containing variable numbers of repeats was then
cloned into pGEX-2T which had been digested previously
with SmaI and BamHI. After cloning of the resulting
recombinant plasmids, the sizes of their inserts were
determined by BamHI-EcoRI digestion and electrophoresis.
A plasmid containing a ~850 nucleotide insert, designated
pGEX-2T-Ag15, was selected for further evaluation. The
presence at the upstream end of this insert of the 5'
nonrepetitive segment of the TCR27 coding region and the
42-nucleotide repeats at its 3' terminus was confirmed by
DNA sequencing, as was the in-frame positioning of the
region that encodes the recombinant protein. When Ag15
was produced in E. coli as described below, a protein of
the expected size was present in a Coomassie blue-stained
gel, and this protein reacted with an anti-TCR27 repeat
serum in a Western blot. This latter serum was produced
by immunizing a rabbit with a synthetic peptide
consisting of two 14-amino acid TCR27 repeats.
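
Because the restriction enzyme cuts within each 42-nucleotide repeat, partial digestion yields fragments whose lengths differ by multiples of 42 nucleotides. The sketch below illustrates this by locating PvuII recognition sites (CAGCTG) in a short tandem array. The repeat unit is copied from one repeat in the sequence listing; the three-copy array and the helper function are hypothetical and serve only as an illustration.

```python
PVUII_SITE = "CAGCTG"  # PvuII recognition sequence (cuts between CAG and CTG)

# One 42-nucleotide repeat unit as it appears in the sequence listing.
REPEAT_UNIT = "".join(
    "GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG".split()
)


def find_sites(seq, site=PVUII_SITE):
    """Return the 0-based start positions of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Hypothetical array of three tandem repeats, for illustration only.
tandem = REPEAT_UNIT * 3
sites = find_sites(tandem)
spacings = [b - a for a, b in zip(sites, sites[1:])]

print(len(REPEAT_UNIT), sites, spacings)
```

Each repeat contributes exactly one site, so neighboring sites are spaced one repeat length apart, and cutting at only some of them leaves fragments that retain differing numbers of repeats.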
Plasmid encoding Ag44.

Beginning with pTCR27-7 DNA (see Ag15 above) a
BamHI-EcoRI fragment was isolated and subjected to partial
digestion with PvuII and fragments ~0.5-0.75 kilobases
were isolated from the resulting mixture. This mixture
of fragments was then treated with ligase to generate
BamHI-EcoRI fragments similar to the native TCR27 coding
region, but with far fewer repeats in their central
regions. The resulting fragments were then cloned into
pGEX-2T previously digested with BamHI and EcoRI. The
sizes of the inserts in the resulting recombinant
plasmids were determined by BamHI and EcoRI digestion and
electrophoresis, and one containing a ~1.1 kilobase
insert, designated pGEX-2T-Ag44, was selected for further
evaluation. The presence at the upstream end of this
insert of the 5' nonrepetitive segment of the TCR27
coding region and the 3' nonrepetitive segment at its 3'
terminus, as well as the presence of an intervening
region of repeats, was confirmed by DNA sequencing. In
addition, the in-frame positioning of the 5' end of the
coding region of the construct was confirmed by this
approach. When Ag44 was produced in E. coli as described
below, a protein of the expected size was present in a
Coomassie blue-stained gel, and this protein reacted with
the anti-TCR27 repeat serum in a Western blot.
Plasmid encoding Ag2-2.

pGEX-2T-Ag44 DNA was digested to completion with
BamHI and PvuII, and fragments ~350 nucleotides in length
were cloned into pGEX-2T previously digested with BamHI
and SmaI. The presence in one of the resulting plasmids
of the 5' nonrepetitive coding region of the TCR27 gene
was confirmed by DNA sequencing, as was a lack of repeats
and the in-frame positioning of the insert. As with the
other recombinant antigens, an appropriately sized
protein was produced in E. coli.
Plasmid encoding Ag4. 

pGEX-2T-Ag44 DNA was digested to completion with
PvuII and EcoRI, and fragments ~350 nucleotides in length
were cloned into pGEX-1 previously digested with SmaI and
EcoRI. The presence in one of the resulting plasmids of
the 3' nonrepetitive coding region of the TCR27 gene was
confirmed by DNA sequencing, as was a lack of repeats and
the in-frame positioning of the insert. As with the
other recombinant antigens, an appropriately sized
protein was produced in E. coli.
Plasmid encoding Ag8.

An EcoRI-HincII fragment of the TCR27 cDNA was cloned
into pBluescript SK that had been previously digested
with these two endonucleases. The resulting recombinant
plasmid was linearized with HincII and then digested with
Bal 31 with the purpose of removing the 3' nonrepetitive
region while leaving a region of repeats. A fragment
obtained by this approach was shown to have a segment
containing ~700 nucleotides of repetitive sequence and
was cloned into pBluescript. The presence of repeats at
both ends of this insert was confirmed by DNA sequencing.
The insert, as a BamHI-HincII fragment, was then excised
from pBluescript and cloned into the BamHI-SmaI site of
pGEX-3X. When Ag8 was produced in E. coli a protein of
the expected size was seen in a Coomassie blue-stained
gel, and this protein reacted with antibodies in the
anti-TCR27 repeat serum.

Example 6. Expression in E. coli and Purification of 
Recombinant Polypeptides 

For the production of recombinant polypeptides,
E. coli DH5α transformed with pGEX bearing a TCR27 coding
segment was grown overnight at 37°C in liquid LB medium
containing 100 µg/ml ampicillin. One-tenth volume of
this culture was then inoculated into approximately 80 ml
fresh LB/amp medium, and after incubation for 1 hour,
isopropyl-β-D-thiogalactopyranoside was added to a
concentration of 0.1 mM and the culture was further
incubated for 3-7 hours at 37°C. The culture was then
centrifuged at 3,000 x g for 15 minutes at 4°C, and after
aspiration of the supernatant the pellet was suspended to
2.5 ml in phosphate buffered saline (PBS) containing 1%
Triton X-100 and 1.6 mM phenylmethylsulfonyl fluoride to
inhibit proteolysis. The cell suspension was sonicated
until it became bubbly and then centrifuged at 10,000 x g
for 10 minutes.

Partial purification of the recombinant polypeptides
was accomplished by mixing the above supernatant with
200 µl of 50% glutathione-agarose beads (Sigma, St. Louis,
MO) suspended in PBS and incubating at room temperature
for 1 hour with gentle shaking. The beads were then
washed 2 times with 0.5% Triton X-100 and 1.6 mM
phenylmethylsulfonyl fluoride in PBS, followed by a
single wash with PBS. To remove the recombinant protein
from the beads, 200 µl of 10 mM glutathione in 50 mM
Tris-HCl, pH 8 was added and incubated for 10 minutes at
room temperature with gentle shaking, and the beads were
pelleted in a microcentrifuge. This procedure was
repeated once and the supernatants obtained were
combined, after which the protein concentration was
determined using a protein assay kit (Bio-Rad, Richmond,
CA).

Example 7. ELISA for Detecting T. cruzi Infection

To test blood samples for antibodies that bind
specifically to the recombinant T. cruzi antigens, the
following procedure was employed. After purification on
glutathione agarose, the recombinant antigen was diluted
in PBS to a concentration of 5 µg/ml (500 ng/100 µl).
One hundred microliters of the diluted antigen solution
was added to each well of a 96-well Immulon 1 plate
(Dynatech Laboratories, Chantilly, VA), and the plate was
then incubated for 1 hour at room temperature, or
overnight at 4°C, and washed 3 times with 0.05% Tween 20
in PBS. Blocking to reduce nonspecific binding of
antibodies was accomplished by adding to each well 200 µl
of a 1% solution of bovine serum albumin in PBS/Tween 20
and incubating for 1 hour. After aspiration of the
blocking solution, 100 µl of the primary antibody
solution (anticoagulated whole blood, plasma, or serum),
diluted in the range of 1/16 to 1/2048 in blocking
solution, was added and incubated for 1 hour at room
temperature or overnight at 4°C. The wells were then
washed 3 times, and 100 µl of goat anti-human IgG
antibody conjugated to horseradish peroxidase (Organon
Teknika, Durham, NC), diluted 1/500 or 1/1000 in
PBS/Tween 20, was added to each well. Subsequently, 100 µl
of o-phenylenediamine dihydrochloride (OPD, Sigma)
solution was added to each
well and incubated for 5-15 minutes. The OPD solution
was prepared by dissolving a 5 mg OPD tablet in 50 ml 1%
methanol in H2O and adding 50 µl 30% H2O2 immediately
before use. The reaction was stopped by adding 25 µl of
4 M H2SO4. Absorbances were read at 490 nm in a microplate
reader (Bio-Rad).
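
The quantities in the ELISA procedure can be checked with a short calculation, sketched below. The antigen concentration, well volume, and the end points of the specimen dilution range come from the text. The twofold step between dilutions is an assumption made for illustration, since the text states only the range; the function names are likewise hypothetical.

```python
ANTIGEN_UG_PER_ML = 5.0   # antigen concentration stated in the text
WELL_VOLUME_UL = 100.0    # volume added per well stated in the text


def antigen_ng_per_well(conc_ug_per_ml, volume_ul):
    """1 ug/ml equals 1 ng/ul, so ng per well = concentration x volume in ul."""
    return conc_ug_per_ml * volume_ul


def twofold_series(first, last):
    """Dilution denominators from first to last, doubling at each step (assumed)."""
    denominators = []
    d = first
    while d <= last:
        denominators.append(d)
        d *= 2
    return denominators


coating_ng = antigen_ng_per_well(ANTIGEN_UG_PER_ML, WELL_VOLUME_UL)
dilutions = twofold_series(16, 2048)

print(f"{coating_ng:.0f} ng antigen per well")
print(", ".join(f"1/{d}" for d in dilutions))
```

Under the twofold assumption the stated range spans eight dilutions, which would occupy one column of a 96-well plate per specimen.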
ATG TCC CCT ATA CTA GGT TAT TGG AAA ATT AAG GGC CTT GTG CAA CCC 48
Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro
1 5 10 15

ACT CGA CTT CTT TTG GAA TAT CTT GAA GAA AAA TAT GAA GAG CAT TTG 96
Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu
20 25 30

TAT GAG CGC GAT GAA GGT GAT AAA TGG CGA AAC AAA AAG TTT GAA TTG 144
Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu
35 40 45

GGT TTG GAG TTT CCC AAT CTT CCT TAT TAT ATT GAT GGT GAT GTT AAA 192
Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys
50 55 60

TTA ACA CAG TCT ATG GCC ATC ATA CGT TAT ATA GCT GAC AAG CAC AAC 240
Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
65 70 75 80

ATG TTG GGT GGT TGT CCA AAA GAG CGT GCA GAG ATT TCA ATG CTT GAA 288
Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu
85 90 95

GGA GCG GTT TTG GAT ATT AGA TAC GGT GTT TCG AGA ATT GCA TAT AGT 336
Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser
100 105 110

AAA GAC TTT GAA ACT CTC AAA GTT GAT TTT CTT AGC AAG CTA CCT GAA 384
Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu
115 120 125

ATG CTG AAA ATG TTC GAA GAT CGT TTA TGT CAT AAA ACA TAT TTA AAT 432
Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn
130 135 140

GGT GAT CAT GTA ACC CAT CCT GAC TTC ATG TTG TAT GAC GCT CTT GAT 480
Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
145 150 155 160

GTT GTT TTA TAC ATG GAC CCA ATG TGC CTG GAT GCG TTC CCA AAA TTA 528
Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu
165 170 175

GTT TGT TTT AAA AAA CGT ATT GAA GCT ATC CCA CAA ATT GAT AAG TAC 576
Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr
180 185 190

TTG AAA TCC AGC AAG TAT ATA GCA TGG CCT TTG CAG GGC TGG CAA GCC 624
Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala
195 200 205

ACG TTT GGT GGT GGC GAC CAT CCT CCA AAA TCG GAT CTG GTT CCG CGT 672
Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Leu Val Pro Arg
210 215 220

GGA TCC CCG TCC CAG CTC CAA CAG GCA GAA AAT AAT ATC ACT AAT TCC 720
Gly Ser Pro Ser Gln Leu Gln Gln Ala Glu Asn Asn Ile Thr Asn Ser
225 230 235 240

AAA AAA GAA ATG ACA AAG CTA CGA GAA AAA GTG AAA AAG GCC GAG AAA 768
Lys Lys Glu Met Thr Lys Leu Arg Glu Lys Val Lys Lys Ala Glu Lys
245 250 255

GAA AAA TTG GAC GCC ATT AAC CGG GCA ACC AAG CTG GAA GAG GAA CGA 816
Glu Lys Leu Asp Ala Ile Asn Arg Ala Thr Lys Leu Glu Glu Glu Arg
260 265 270

AAC CAA GCG TAC AAA GCA GCA CAC AAG GCA GAG GAG GAA AAG GCT AAA 864
Asn Gln Ala Tyr Lys Ala Ala His Lys Ala Glu Glu Glu Lys Ala Lys
275 280 285

ACA TTT CAA CGC CTT ATA ACA TTT GAG TCG GAA AAT ATT AAC TTA AAG 912
Thr Phe Gln Arg Leu Ile Thr Phe Glu Ser Glu Asn Ile Asn Leu Lys
290 295 300

AAA AGG CCA AAT GAC GCA GTT TCA AAT CGG GAT AAG AAA AAA AAT TCT 960
Lys Arg Pro Asn Asp Ala Val Ser Asn Arg Asp Lys Lys Lys Asn Ser
305 310 315 320

GAA ACC GCA AAA ACT GAC GAA GTA GAG AAA CAG AGG GCG GCT GAG GCT 1008
Glu Thr Ala Lys Thr Asp Glu Val Glu Lys Gln Arg Ala Ala Glu Ala
325 330 335

GCC AAG GCC GTG GAG ACG GAG AAG CAG AGG GCA GCT GAG GCC ACG AAG 1056
Ala Lys Ala Val Glu Thr Glu Lys Gln Arg Ala Ala Glu Ala Thr Lys
340 345 350

GTT GCC GAA GCG GAG AAG CGG AAG GCA GCT GAG GCC GCC AAG GCC GTG 1104
Val Ala Glu Ala Glu Lys Arg Lys Ala Ala Glu Ala Ala Lys Ala Val
355 360 365

GAG ACG GAG AAG CAG AGG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG 1152
Glu Thr Glu Lys Gln Arg Ala Ala Glu Ala Thr Lys Val Ala Glu Ala
370 375 380

GAG AAG CAG AAG GCA GCT GAG GCC GCC AAG GCC GTG GAG ACG GAG AAG 1200
Glu Lys Gln Lys Ala Ala Glu Ala Ala Lys Ala Val Glu Thr Glu Lys
385 390 395 400

CAG AGG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AGG 1248
Gln Arg Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Arg
405 410 415

GCA GCT GAA GCC ATG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT 1296
Ala Ala Glu Ala Met Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala
420 425 430

GAG GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC 1344
Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala
435 440 445

ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG 1392
Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys
450 455 460

GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC 1440
Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala
465 470 475 480

GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG 1488
Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala
485 490 495

GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG 1536
Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys
500 505 510

CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG 1584
Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys
515 520 525

GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT 1632
Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala
530 535 540

GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GGG GAA TTC 1680
Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Gly Glu Phe
545 550 555 560

ATC GTG ACT GAC TGA 1695
Ile Val Thr Asp
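
Each nucleotide row of the listing is printed above its three-letter translation, so the two rows can be cross-checked against each other. The following is a minimal sketch of such a check using the standard genetic code. The helper names are hypothetical, unreadable codons are reported as "???", and the deliberately altered codon in the second call is an invented example, not a row from the listing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG x TCAG x TCAG order.
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",  # stop codon
}
CODON_TABLE = {
    "".join(codon): ONE_TO_THREE[aa]
    for codon, aa in zip(product(BASES, repeat=3), AMINO_1)
}


def translate_row(nt_row):
    """Translate a whitespace-separated row of codons into three-letter codes."""
    return [CODON_TABLE.get(codon, "???") for codon in nt_row.upper().split()]


def mismatches(nt_row, aa_row):
    """List (index, codon, translated, printed) wherever the two rows disagree."""
    codons = nt_row.upper().split()
    printed = aa_row.split()
    return [
        (i, codon, got, want)
        for i, (codon, got, want) in enumerate(
            zip(codons, translate_row(nt_row), printed)
        )
        if got != want
    ]


# One repeat unit from the listing and its printed translation.
repeat_nt = "GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG"
repeat_aa = "Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys"
print(mismatches(repeat_nt, repeat_aa))

# Hypothetical row with one altered codon (GGC instead of GCC).
print(mismatches("GAA GGC", "Glu Ala"))
```

A row that translates cleanly returns an empty list, while a disagreement points to the codon position where either the nucleotide row or the amino acid row should be re-examined.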
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 564 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro
1 5 10 15

Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu
20 25 30

Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu
35 40 45

Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys
50 55 60

Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
65 70 75 80

Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu
85 90 95

Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser
100 105 110

Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu
115 120 125

Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn
130 135 140

Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
145 150 155 160

Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu
165 170 175

Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr
180 185 190

Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala
195 200 205

Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Leu Val Pro Arg
210 215 220

Gly Ser Pro Ser Gln Leu Gln Gln Ala Glu Asn Asn Ile Thr Asn Ser
225 230 235 240

Lys Lys Glu Met Thr Lys Leu Arg Glu Lys Val Lys Lys Ala Glu Lys
245 250 255

Glu Lys Leu Asp Ala Ile Asn Arg Ala Thr Lys Leu Glu Glu Glu Arg
260 265 270

Asn Gln Ala Tyr Lys Ala Ala His Lys Ala Glu Glu Glu Lys Ala Lys
275 280 285

Thr Phe Gln Arg Leu Ile Thr Phe Glu Ser Glu Asn Ile Asn Leu Lys
290 295 300

Lys Arg Pro Asn Asp Ala Val Ser Asn Arg Asp Lys Lys Lys Asn Ser
305 310 315 320

Glu Thr Ala Lys Thr Asp Glu Val Glu Lys Gln Arg Ala Ala Glu Ala
325 330 335

Ala Lys Ala Val Glu Thr Glu Lys Gln Arg Ala Ala Glu Ala Thr Lys
340 345 350

Val Ala Glu Ala Glu Lys Arg Lys Ala Ala Glu Ala Ala Lys Ala Val
355 360 365

Glu Thr Glu Lys Gln Arg Ala Ala Glu Ala Thr Lys Val Ala Glu Ala
370 375 380

Glu Lys Gln Lys Ala Ala Glu Ala Ala Lys Ala Val Glu Thr Glu Lys
385 390 395 400

Gln Arg Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Arg
405 410 415

Ala Ala Glu Ala Met Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala
420 425 430

Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala
435 440 445

Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys
450 455 460

Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala
465 470 475 480

Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala
485 490 495

Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys
500 505 510

Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys
515 520 525

Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala
530 535 540

Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Gly Glu Phe
545 550 555 560

Ile Val Thr Asp



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1065 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1062

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATG TCC CCT ATA CTA GGT TAT TGG AAA ATT AAG GGC CTT GTG CAA CCC 48
Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro
1 5 10 15

ACT CGA CTT CTT TTG GAA TAT CTT GAA GAA AAA TAT GAA GAG CAT TTG 96
Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu
20 25 30

TAT GAG CGC GAT GAA GGT GAT AAA TGG CGA AAC AAA AAG TTT GAA TTG 144
Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu
35 40 45

GGT TTG GAG TTT CCC AAT CTT CCT TAT TAT ATT GAT GGT GAT GTT AAA 192
Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys
50 55 60

TTA ACA CAG TCT ATG GCC ATC ATA CGT TAT ATA GCT GAC AAG CAC AAC 240
Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
65 70 75 80

ATG TTG GGT GGT TGT CCA AAA GAG CGT GCA GAG ATT TCA ATG CTT GAA 288
Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu
85 90 95

GGA GCG GTT TTG GAT ATT AGA TAC GGT GTT TCG AGA ATT GCA TAT AGT 336
Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser
100 105 110

AAA GAC TTT GAA ACT CTC AAA GTT GAT TTT CTT AGC AAG CTA CCT GAA 384
Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu
115 120 125

ATG CTG AAA ATG TTC GAA GAT CGT TTA TGT CAT AAA ACA TAT TTA AAT 432
Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn
130 135 140

GGT GAT CAT GTA ACC CAT CCT GAC TTC ATG TTG TAT GAC GCT CTT GAT 480
Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
145 150 155 160

GTT GTT TTA TAC ATG GAC CCA ATG TGC CTG GAT GCG TTC CCA AAA TTA 528
Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu
165 170 175

GTT TGT TTT AAA AAA CGT ATT GAA GCT ATC CCA CAA ATT GAT AAG TAC 576
Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr
180 185 190

TTG AAA TCC AGC AAG TAT ATA GCA TGG CCT TTG CAG GGC TGG CAA GCC 624
Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala
195 200 205

ACG TTT GGT GGT GGC GAC CAT CCT CCA AAA TCG GAT CTG GTT CCG CGT 672
Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Leu Val Pro Arg
210 215 220

GGA TCC CCG TCC CAG CTC CAA CAG GCA GAA AAT AAT ATC ACT AAT TCC 720
Gly Ser Pro Ser Gln Leu Gln Gln Ala Glu Asn Asn Ile Thr Asn Ser
225 230 235 240

AAA AAA GAA ATG ACA AAG CTA CGA GAA AAA GTG AAA AAG GCC GAG AAA 768
Lys Lys Glu Met Thr Lys Leu Arg Glu Lys Val Lys Lys Ala Glu Lys
245 250 255

GAA AAA TTG GAC GCC ATT AAC CGG GCA ACC AAG CTG GAA GAG GAA CGA 816
Glu Lys Leu Asp Ala Ile Asn Arg Ala Thr Lys Leu Glu Glu Glu Arg
260 265 270

AAC CAA GCG TAC AAA GCA GCA CAC AAG GCA GAG GAG GAA AAG GCT AAA 864
Asn Gln Ala Tyr Lys Ala Ala His Lys Ala Glu Glu Glu Lys Ala Lys
275 280 285

ACA TTT CAA CGC CTT ATA ACA TTT GAG TCG GAA AAT ATT AAC TTA AAG 912
Thr Phe Gln Arg Leu Ile Thr Phe Glu Ser Glu Asn Ile Asn Leu Lys
290 295 300

AAA AGG CCA AAT GAC GCA GTT TCA AAT CGG GAT AAG AAA AAA AAT TCT 960
Lys Arg Pro Asn Asp Ala Val Ser Asn Arg Asp Lys Lys Lys Asn Ser
305 310 315 320

GAA ACC GCA AAA ACT GAC GAA GTA GAG AAA CAG AGG GCG GCT GAG GCT 1008
Glu Thr Ala Lys Thr Asp Glu Val Glu Lys Gln Arg Ala Ala Glu Ala
325 330 335

GCC AAG GCC GTG GAG ACG GAG AAG CAG AGG GCA GGG GAA TTC ATC GTG 1056
Ala Lys Ala Val Glu Thr Glu Lys Gln Arg Ala Gly Glu Phe Ile Val
340 345 350

ACT GAC TGA 1065
Thr Asp



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 354 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro
1 5 10 15

Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu
20 25 30

Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu
35 40 45

Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys
50 55 60

Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
65 70 75 80

Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu
85 90 95

Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser
100 105 110

Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu
115 120 125

Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn
130 135 140

Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
145 150 155 160

Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu
165 170 175

Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr
180 185 190

Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala
195 200 205

Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Leu Val Pro Arg
210 215 220

Gly Ser Pro Ser Gln Leu Gln Gln Ala Glu Asn Asn Ile Thr Asn Ser
225 230 235 240

Lys Lys Glu Met Thr Lys Leu Arg Glu Lys Val Lys Lys Ala Glu Lys
245 250 255

Glu Lys Leu Asp Ala Ile Asn Arg Ala Thr Lys Leu Glu Glu Glu Arg
260 265 270

Asn Gln Ala Tyr Lys Ala Ala His Lys Ala Glu Glu Glu Lys Ala Lys
275 280 285

Thr Phe Gln Arg Leu Ile Thr Phe Glu Ser Glu Asn Ile Asn Leu Lys
290 295 300

Lys Arg Pro Asn Asp Ala Val Ser Asn Arg Asp Lys Lys Lys Asn Ser
305 310 315 320

Glu Thr Ala Lys Thr Asp Glu Val Glu Lys Gln Arg Ala Ala Glu Ala
325 330 335

Ala Lys Ala Val Glu Thr Glu Lys Gln Arg Ala Gly Glu Phe Ile Val
340 345 350

Thr Asp



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 924 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..921 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ATG TCC CCT ATA CTA GGT TAT TGG AAA ATT AAG GGC CTT GTG CAA CCC 48
Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro
1 5 10 15

ACT CGA CTT CTT TTG GAA TAT CTT GAA GAA AAA TAT GAA GAG CAT TTG 96
Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu
20 25 30

TAT GAG CGC GAT GAA GGT GAT AAA TGG CGA AAC AAA AAG TTT GAA TTG 144
Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu
35 40 45

GGT TTG GAG TTT CCC AAT CTT CCT TAT TAT ATT GAT GGT GAT GTT AAA 192
Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys
50 55 60

TTA ACA CAG TCT ATG GCC ATC ATA CGT TAT ATA GCT GAC AAG CAC AAC 240
Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
65 70 75 80

ATG TTG GGT GGT TGT CCA AAA GAG CGT GCA GAG ATT TCA ATG CTT GAA 288
Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu
85 90 95

GGA GCG GTT TTG GAT ATT AGA TAC GGT GTT TCG AGA ATT GCA TAT AGT 336
Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser
100 105 110

AAA GAC TTT GAA ACT CTC AAA GTT GAT TTT CTT AGC AAG CTA CCT GAA 384
Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu
115 120 125

ATG CTG AAA ATG TTC GAA GAT CGT TTA TGT CAT AAA ACA TAT TTA AAT 432
Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn
130 135 140

GGT GAT CAT GTA ACC CAT CCT GAC TTC ATG TTG TAT GAC GCT CTT GAT 480
Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
145 150 155 160

GTT GTT TTA TAC ATG GAC CCA ATG TGC CTG GAT GCG TTC CCA AAA TTA 528
Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu
165 170 175

GTT TGT TTT AAA AAA CGT ATT GAA GCT ATC CCA CAA ATT GAT AAG TAC 576
Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr
180 185 190

TTG AAA TCC AGC AAG TAT ATA GCA TGG CCT TTG CAG GGC TGG CAA GCC 624
Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala
195 200 205

ACG TTT GGT GGT GGC GAC CAT CCT CCA AAA TCG GAT CCC CCT GAA GCT 672
Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Pro Pro Glu Ala
210 215 220

GCC AAG GCT ATG GAG TCG CAG AAG CAG AGA TTC TTA GAA CGT TTT GCG 720
Ala Lys Ala Met Glu Ser Gln Lys Gln Arg Phe Leu Glu Arg Phe Ala
225 230 235 240

GTT CTT GAG GAG GAG AAA AAG GCA GCC TTA AGA GCG GCG GAG ATG GAG 768
Val Leu Glu Glu Glu Lys Lys Ala Ala Leu Arg Ala Ala Glu Met Glu
245 250 255

AGG AGG AAA ATA ACA AAC ATA ATG AAG AAT AAA GGT GTA CGC AGT TCG 816
Arg Arg Lys Ile Thr Asn Ile Met Lys Asn Lys Gly Val Arg Ser Ser
260 265 270

GAT TCG GTG CCG CTT GTG GAG GGG AAT CGC TCT GTT ACT GAG AGT TCT 864
Asp Ser Val Pro Leu Val Glu Gly Asn Arg Ser Val Thr Glu Ser Ser
275 280 285

TGT AGA AAT CGG TTT CGT TTT TGT AGA AAT CGG TTT CGT TTT TCA TGT 912
Cys Arg Asn Arg Phe Arg Phe Cys Arg Asn Arg Phe Arg Phe Ser Cys
290 295 300

TCT GTA ATG TGA 924
Ser Val Met
305

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 307 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro
1 5 10 15

Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu
20 25 30

Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu
35 40 45

Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys
50 55 60

Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
65 70 75 80

Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu
85 90 95

Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser
100 105 110

Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu
115 120 125

Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn
130 135 140

Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
145 150 155 160

Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu
165 170 175

Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr
180 185 190

Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala
195 200 205

Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Pro Pro Glu Ala
210 215 220

Ala Lys Ala Met Glu Ser Gln Lys Gln Arg Phe Leu Glu Arg Phe Ala
225 230 235 240

Val Leu Glu Glu Glu Lys Lys Ala Ala Leu Arg Ala Ala Glu Met Glu
245 250 255

Arg Arg Lys Ile Thr Asn Ile Met Lys Asn Lys Gly Val Arg Ser Ser
260 265 270

Asp Ser Val Pro Leu Val Glu Gly Asn Arg Ser Val Thr Glu Ser Ser
275 280 285

Cys Arg Asn Arg Phe Arg Phe Cys Arg Asn Arg Phe Arg Phe Ser Cys
290 295 300

Ser Val Met
305

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1932 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1929 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

ATG TCC CCT ATA CTA GGT TAT TGG AAA ATT AAG GGC CTT GTG CAA CCC 48 
Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro 
1 5 10 15 

ACT CGA CTT CTT TTG GAA TAT CTT GAA GAA AAA TAT GAA GAG CAT TTG 96 
Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu 
20 25 30 

TAT GAG CGC GAT GAA GGT GAT AAA TGG CGA AAC AAA AAG TTT GAA TTG 144 
Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu 
35 40 45 

GGT TTG GAG TTT CCC AAT CTT CCT TAT TAT ATT GAT GGT GAT GTT AAA 192 
Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys 
50 55 60 

TTA ACA CAG TCT ATG GCC ATC ATA CGT TAT ATA GCT GAC AAG CAC AAC 240 
Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn 
65 70 75 80 

ATG TTG GGT GGT TGT CCA AAA GAG CGT GCA GAG ATT TCA ATG CTT GAA 288 
Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu 
85 90 95 

GGA GCG GTT TTG GAT ATT AGA TAC GGT GTT TCG AGA ATT GCA TAT AGT 336 
Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser 
100 105 110 

AAA GAC TTT GAA ACT CTC AAA GTT GAT TTT CTT AGC AAG CTA CCT GAA 384 
Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu 
115 120 125 

ATG CTG AAA ATG TTC GAA GAT CGT TTA TGT CAT AAA ACA TAT TTA AAT 432 
Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn 
130 135 140 

GGT GAT CAT GTA ACC CAT CCT GAC TTC ATG TTG TAT GAC GCT CTT GAT 480 
Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp 
145 150 155 160 

GTT GTT TTA TAC ATG GAC CCA ATG TGC CTG GAT GCG TTC CCA AAA TTA 528 
Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu 
165 170 175 

GTT TGT TTT AAA AAA CGT ATT GAA GCT ATC CCA CAA ATT GAT AAG TAC 576 
Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr 
180 185 190 

TTG AAA TCC AGC AAG TAT ATA GCA TGG CCT TTG CAG GGC TGG CAA GCC 624 
Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala 
195 200 205 

ACG TTT GGT GGT GGC GAC CAT CCT CCA AAA TCG GAT CTG GTT CCG CGT 672 
Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Leu Val Pro Arg 
210 215 220 

GGA TCC CCG TCC CAG CTC CAA CAG GCA GAA AAT AAT ATC ACT AAT TCC 720 
Gly Ser Pro Ser Gln Leu Gln Gln Ala Glu Asn Asn Ile Thr Asn Ser 
225 230 235 240 

AAA AAA GAA ATG ACA AAG CTA CGA GAA AAA GTG AAA AAG GCC GAG AAA 768 
Lys Lys Glu Met Thr Lys Leu Arg Glu Lys Val Lys Lys Ala Glu Lys 
245 250 255 

GAA AAA TTG GAC GCC ATT AAC CGG GCA ACC AAG CTG GAA GAG GAA CGA 816 
Glu Lys Leu Asp Ala Ile Asn Arg Ala Thr Lys Leu Glu Glu Glu Arg 
260 265 270 

AAC CAA GCG TAC AAA GCA GCA CAC AAG GCA GAG GAG GAA AAG GCT AAA 864 
Asn Gln Ala Tyr Lys Ala Ala His Lys Ala Glu Glu Glu Lys Ala Lys 
275 280 285 

ACA TTT CAA CGC CTT ATA ACA TTT GAG TCG GAA AAT ATT AAC TTA AAG 912 
Thr Phe Gln Arg Leu Ile Thr Phe Glu Ser Glu Asn Ile Asn Leu Lys 
290 295 300 

AAA AGG CCA AAT GAC GCA GTT TCA AAT CGG GAT AAG AAA AAA AAT TCT 960 
Lys Arg Pro Asn Asp Ala Val Ser Asn Arg Asp Lys Lys Lys Asn Ser 
305 310 315 320 

GAA ACC GCA AAA ACT GAC GAA GTA GAG AAA CAG AGG GCG GCT GAG GCT 1008 
Glu Thr Ala Lys Thr Asp Glu Val Glu Lys Gln Arg Ala Ala Glu Ala 
325 330 335 

GCC AAG GCC GTG GAG ACG GAG AAG CAG AGG GCA GCT GAG GCC ACG AAG 1056 
Ala Lys Ala Val Glu Thr Glu Lys Gln Arg Ala Ala Glu Ala Thr Lys 
340 345 350 

GTT GCC GAA GCG GAG AAG CGG AAG GCA GCT GAG GCC GCC AAG GCC GTG 1104 
Val Ala Glu Ala Glu Lys Arg Lys Ala Ala Glu Ala Ala Lys Ala Val 
355 360 365 

GAG ACG GAG AAG CAG AGG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG 1152 
Glu Thr Glu Lys Gln Arg Ala Ala Glu Ala Thr Lys Val Ala Glu Ala 
370 375 380 

GAG AAG CAG AAG GCA GCT GAG GCC GCC AAG GCC GTG GAG ACG GAG AAG 1200 
Glu Lys Gln Lys Ala Ala Glu Ala Ala Lys Ala Val Glu Thr Glu Lys 
385 390 395 400 

CAG AGG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AGG 1248 
Gln Arg Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Arg 
405 410 415 

GCA GCT GAA GCC ATG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT 1296 
Ala Ala Glu Ala Met Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala 
420 425 430 

GAG GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC 1344 
Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala 
435 440 445 

ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG 1392 
Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys 
450 455 460 

GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC 1440 
Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala 
465 470 475 480 

GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG 1488 
Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala 
485 490 495 

GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG 1536 
Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys 
500 505 510 

CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG 1584 
Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys 
515 520 525 

GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT 1632 
Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala 
530 535 540 

GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT GAA GCT 1680 
Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala 
545 550 555 560 

GCC AAG GCT ATG GAG TCG CAG AAG CAG AGA TTC TTA GAA CGT TTT GCG 1728 
Ala Lys Ala Met Glu Ser Gln Lys Gln Arg Phe Leu Glu Arg Phe Ala 
565 570 575 

GTT CTT GAG GAG GAG AAA AAG GCA GCC TTA AGA GCG GCG GAG ATG GAG 1776 
Val Leu Glu Glu Glu Lys Lys Ala Ala Leu Arg Ala Ala Glu Met Glu 
580 585 590 

AGG AGG AAA ATA ACA AAC ATA ATG AAG AAT AAA GGT GTA CGC AGT TCG 1824 
Arg Arg Lys Ile Thr Asn Ile Met Lys Asn Lys Gly Val Arg Ser Ser 
595 600 605 

GAT TCG GTG CCG CTT GTG GAG GGG AAT CGC TCT GTT ACT GAG AGT TCT 1872 
Asp Ser Val Pro Leu Val Glu Gly Asn Arg Ser Val Thr Glu Ser Ser 
610 615 620 

TGT AGA AAT CGG TTT CGT TTT TGT AGA AAT CGG TTT CGT TTT TCA TGT 1920 
Cys Arg Asn Arg Phe Arg Phe Cys Arg Asn Arg Phe Arg Phe Ser Cys 
625 630 635 640 

TCT GTA ATG TGA 1932 
Ser Val Met 
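
The pairing of each codon row with its amino acid row in the listing above can be checked mechanically. The following is a minimal sketch, not part of the original disclosure: it assumes the standard genetic code, and the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative only. It translates the first and last codon rows of SEQ ID NO: 7 and checks that a coding region of 1929 bases (feature location 1..1929) corresponds to 643 residues.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (assumption: the listing uses the standard code; '*' marks a stop codon).
BASES = "TCAG"
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), ONE_LETTER)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into three-letter residues.

    Translation stops at the first stop codon; a trailing partial codon
    is ignored.
    """
    dna = dna.replace(" ", "").upper()
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return residues


first_row = "ATG TCC CCT ATA CTA GGT TAT TGG AAA ATT AAG GGC CTT GTG CAA CCC"
last_row = "TCT GTA ATG TGA"
print(" ".join(translate(first_row)))
print(" ".join(translate(last_row)))
print(1929 // 3)  # residues encoded by a 1929-base coding region
```

The same helper could be applied to any other codon row of the listing to confirm that a corrected base is consistent with the residue printed beneath it.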



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 643 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro 
1 5 10 15 

Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu 
20 25 30 

Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu 
35 40 45 

Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys 
50 55 60 

Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn 
65 70 75 80 

Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu 
85 90 95 

Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser 
100 105 110 

Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu 
115 120 125 

Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn 
130 135 140 

Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp 
145 150 155 160 

Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu 
165 170 175 

Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr 
180 185 190 

Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala 
195 200 205 

Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Leu Val Pro Arg 
210 215 220 

Gly Ser Pro Ser Gln Leu Gln Gln Ala Glu Asn Asn Ile Thr Asn Ser 
225 230 235 240 

Lys Lys Glu Met Thr Lys Leu Arg Glu Lys Val Lys Lys Ala Glu Lys 
245 250 255 

Glu Lys Leu Asp Ala Ile Asn Arg Ala Thr Lys Leu Glu Glu Glu Arg 
260 265 270 

Asn Gln Ala Tyr Lys Ala Ala His Lys Ala Glu Glu Glu Lys Ala Lys 
275 280 285 

Thr Phe Gln Arg Leu Ile Thr Phe Glu Ser Glu Asn Ile Asn Leu Lys 
290 295 300 

Lys Arg Pro Asn Asp Ala Val Ser Asn Arg Asp Lys Lys Lys Asn Ser 
305 310 315 320 

Glu Thr Ala Lys Thr Asp Glu Val Glu Lys Gln Arg Ala Ala Glu Ala 
325 330 335 

Ala Lys Ala Val Glu Thr Glu Lys Gln Arg Ala Ala Glu Ala Thr Lys 
340 345 350 

Val Ala Glu Ala Glu Lys Arg Lys Ala Ala Glu Ala Ala Lys Ala Val 
355 360 365 

Glu Thr Glu Lys Gln Arg Ala Ala Glu Ala Thr Lys Val Ala Glu Ala 
370 375 380 

Glu Lys Gln Lys Ala Ala Glu Ala Ala Lys Ala Val Glu Thr Glu Lys 
385 390 395 400 

Gln Arg Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Arg 
405 410 415 

Ala Ala Glu Ala Met Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala 
420 425 430 

Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala 
435 440 445 

Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys 
450 455 460 

Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala 
465 470 475 480 

Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala 
485 490 495 

Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys 
500 505 510 

Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys 
515 520 525 

Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala 
530 535 540 

Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala 
545 550 555 560 

Ala Lys Ala Met Glu Ser Gln Lys Gln Arg Phe Leu Glu Arg Phe Ala 
565 570 575 

Val Leu Glu Glu Glu Lys Lys Ala Ala Leu Arg Ala Ala Glu Met Glu 
580 585 590 

Arg Arg Lys Ile Thr Asn Ile Met Lys Asn Lys Gly Val Arg Ser Ser 
595 600 605 

Asp Ser Val Pro Leu Val Glu Gly Asn Arg Ser Val Thr Glu Ser Ser 
610 615 620 

Cys Arg Asn Arg Phe Arg Phe Cys Arg Asn Arg Phe Arg Phe Ser Cys 
625 630 635 640 

Ser Val Met 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1419 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1416 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

ATG TCC CCT ATA CTA GGT TAT TGG AAA ATT AAG GGC CTT GTG CAA CCC 48 
Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro 
1 5 10 15 

ACT CGA CTT CTT TTG GAA TAT CTT GAA GAA AAA TAT GAA GAG CAT TTG 96 
Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu 
20 25 30 

TAT GAG CGC GAT GAA GGT GAT AAA TGG CGA AAC AAA AAG TTT GAA TTG 144 
Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu 
35 40 45 

GGT TTG GAG TTT CCC AAT CTT CCT TAT TAT ATT GAT GGT GAT GTT AAA 192 
Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys 
50 55 60 

TTA ACA CAG TCT ATG GCC ATC ATA CGT TAT ATA GCT GAC AAG CAC AAC 240 
Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn 
65 70 75 80 

ATG TTG GGT GGT TGT CCA AAA GAG CGT GCA GAG ATT TCA ATG CTT GAA 288 
Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu 
85 90 95 

GGA GCG GTT TTG GAT ATT AGA TAC GGT GTT TCG AGA ATT GCA TAT AGT 336 
Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser 
100 105 110 

AAA GAC TTT GAA ACT CTC AAA GTT GAT TTT CTT AGC AAG CTA CCT GAA 384 
Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu 
115 120 125 

ATG CTG AAA ATG TTC GAA GAT CGT TTA TGT CAT AAA ACA TAT TTA AAT 432 
Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn 
130 135 140 

GGT GAT CAT GTA ACC CAT CCT GAC TTC ATG TTG TAT GAC GCT CTT GAT 480 
Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp 
145 150 155 160 

GTT GTT TTA TAC ATG GAC CCA ATG TGC CTG GAT GCG TTC CCA AAA TTA 528 
Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu 
165 170 175 

GTT TGT TTT AAA AAA CGT ATT GAA GCT ATC CCA CAA ATT GAT AAG TAC 576 
Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr 
180 185 190 

TTG AAA TCC AGC AAG TAT ATA GCA TGG CCT TTG CAG GGC TGG CAA GCC 624 
Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala 
195 200 205 

ACG TTT GGT GGT GGC GAC CAT CCT CCA AAA TCG GAT CTG ATC GAA GGT 672 
Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Leu Ile Glu Gly 
210 215 220 



CGT GGG ATC CCC CCG GGC TGC AGG AAT TCC ACG AAG GTT GCC GAA GCG 720 
Arg Gly Ile Pro Pro Gly Cys Arg Asn Ser Thr Lys Val Ala Glu Ala 
225 230 235 240 

GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG 768 
Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys 
245 250 255 

CAG AGG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG 816 
Gln Arg Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys 
260 265 270 

GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AGG GCA GCT 864 
Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Arg Ala Ala 
275 280 285 

GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAA AAG GCA GCT GAG GCC 912 
Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala 
290 295 300 

ACG AAG GTT GCC GGA GAC GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG 960 
Thr Lys Val Ala Gly Asp Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys 
305 310 315 320 

GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC 1008 
Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala 
325 330 335 

GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG 1056 
Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala 
340 345 350 

GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG 1104 
Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys 
355 360 365 

CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG 1152 
Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys 
370 375 380 

GCA GCT GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT 1200 
Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala 
385 390 395 400 

GAA GCC ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC 1248 
Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala 
405 410 415 

ACG AAG GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG 1296 
Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys 
420 425 430 

GTT GCC GAA GCG GAG AAG CAG AAG GCA GCT GAA GCC ACG AAG GTT GCC 1344 
Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala 
435 440 445 

GAA GCG GAG AAG CAG AAG GTA GGT GAG GCT GAT CAA GCT TAT CGA TAC 1392 
Glu Ala Glu Lys Gln Lys Val Gly Glu Ala Asp Gln Ala Tyr Arg Tyr 
450 455 460 

CGT CGG GAA TTC ATC GTG ACT GAC TGA 1419 
Arg Arg Glu Phe Ile Val Thr Asp 
465 470 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 472 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro 
1 5 10 15 

Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu 
20 25 30 

Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu 
35 40 45 

Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys 
50 55 60 

Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn 
65 70 75 80 

Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu 
85 90 95 

Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser 
100 105 110 

Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu 
115 120 125 

Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn 
130 135 140 

Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp 
145 150 155 160 

Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu 
165 170 175 

Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr 
180 185 190 

Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala 
195 200 205 

Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Leu Ile Glu Gly 
210 215 220 

Arg Gly Ile Pro Pro Gly Cys Arg Asn Ser Thr Lys Val Ala Glu Ala 
225 230 235 240 

Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys 
245 250 255 

Gln Arg Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys 
260 265 270 

Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Arg Ala Ala 
275 280 285 

Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala 
290 295 300 

Thr Lys Val Ala Gly Asp Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys 
305 310 315 320 

Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala 
325 330 335 

Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala 
340 345 350 

Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys 
355 360 365 

Gln Lys Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys 
370 375 380 

Ala Ala Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala 
385 390 395 400 

Glu Ala Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala 
405 410 415 

Thr Lys Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys 
420 425 430 

Val Ala Glu Ala Glu Lys Gln Lys Ala Ala Glu Ala Thr Lys Val Ala 
435 440 445 

Glu Ala Glu Lys Gln Lys Val Gly Glu Ala Asp Gln Ala Tyr Arg Tyr 
450 455 460 

Arg Arg Glu Phe Ile Val Thr Asp 
465 470 
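
The sequence listing uses three-letter residue codes, whereas the figure sheets print the same constructs in one-letter code. As a convenience for comparing the two, the following minimal sketch converts a listing row to one-letter code. It is not part of the original disclosure; the names `THREE_TO_ONE` and `to_one_letter` are illustrative, and the mapping is the conventional IUPAC one for the twenty standard amino acids.

```python
# Conventional three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(row):
    """Convert a whitespace-separated row of three-letter codes.

    Raises KeyError on an unrecognised token, which is a quick way to
    spot a mis-scanned residue name.
    """
    return "".join(THREE_TO_ONE[token] for token in row.split())


# First and last rows of SEQ ID NO: 10 as given in the listing.
print(to_one_letter(
    "Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro"
))
print(to_one_letter("Arg Arg Glu Phe Ile Val Thr Asp"))
```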
WHAT WE CLAIM IS: 

1. A polypeptide having a sequence that corresponds 
to the amino acid sequence of at least one of the 
C-terminal and N-terminal nonrepetitive regions of the 
TCR27 protein. 

2. A polypeptide as claimed in Claim 1, wherein said 
polypeptide comprises an amino acid sequence of one or 
more repeats from the central region of the TCR27 
protein. 

3. A polypeptide as claimed in Claim 2, wherein said 
polypeptide corresponds to the N-terminal nonrepetitive 
region of the TCR27 protein and at least one repeat from 
the central region of the TCR27 protein, and does not 
correspond to the C-terminal nonrepetitive region. 

4. A polypeptide as claimed in Claim 1, additionally 
comprising a linker sequence at either the N-terminus or 
the C-terminus to facilitate attachment or conjugation of 
said polypeptide to a carrier molecule in a liquid or 
solid support system. 

5. A polypeptide as claimed in Claim 2, additionally 
comprising a linker sequence at either the N-terminus or 
the C-terminus to facilitate attachment or conjugation of 
said polypeptide to a carrier molecule in a liquid or 
solid support system. 

6. A polypeptide as claimed in Claim 3, additionally 
comprising a linker sequence at either the N-terminus or 
the C-terminus to facilitate attachment or conjugation of 
said polypeptide to a carrier molecule in a liquid or 
solid support system. 

7. A polypeptide as claimed in Claim 1, wherein said 
polypeptide is substantially pure. 

8. A polypeptide as claimed in Claim 2, wherein said 
polypeptide is substantially pure. 

9. A polypeptide as claimed in Claim 3, wherein said 
polypeptide is substantially pure. 

10. An isolated polynucleotide encoding a 
polypeptide as claimed in Claim 1. 

11. An isolated polynucleotide encoding a 
polypeptide as claimed in Claim 2. 

12. An isolated polynucleotide encoding a 
polypeptide as claimed in Claim 3. 

13. A cell transformed with a recombinant plasmid 
that expresses a polypeptide as claimed in Claim 1. 

14. A cell transformed with a recombinant plasmid 
that expresses a polypeptide as claimed in Claim 2. 

15. A cell transformed with a recombinant plasmid 
that expresses a polypeptide as claimed in Claim 3. 

16. A method for detecting the presence of 
antibodies to T. cruzi in an individual, comprising the 
steps of: 

contacting a putative anti-T. cruzi antibody-containing 
sample from an individual with a polypeptide 
as claimed in Claim 1 that is attached or conjugated to 
a carrier molecule or attached or conjugated to a solid 
phase; 

allowing anti-T. cruzi antibodies in said sample to 
bind to said polypeptide; 

washing away unbound anti-T. cruzi antibodies; and 

adding a compound that enables detection of the 
anti-T. cruzi antibodies which are specifically bound to the 
polypeptide. 

17. A method for detecting the presence of 
antibodies to T. cruzi in an individual, comprising the 
steps of: 

contacting a putative anti-T. cruzi antibody-containing 
sample from an individual with a polypeptide 
as claimed in Claim 2 that is attached or conjugated to 
a carrier molecule or attached or conjugated to a solid 
phase; 

allowing anti-T. cruzi antibodies in said sample to 
bind to said polypeptide; 

washing away unbound anti-T. cruzi antibodies; and 

adding a compound that enables detection of the 
anti-T. cruzi antibodies which are specifically bound to the 
polypeptide. 



wo 9505797 PCT/US9S;03191 


18. A method for detecting the presence of 
antibodies to T. cruzi in an individual, comprising the 
steps of: 

contacting a putative anti-T. cruzi antibody-containing 
sample from an individual with a polypeptide 
as claimed in Claim 3 that is attached or conjugated to 
a carrier molecule or attached or conjugated to a solid 
phase; 

allowing anti-T. cruzi antibodies in said sample to 
bind to said polypeptide; 

washing away unbound anti-T. cruzi antibodies; and 

adding a compound that enables detection of the 
anti-T. cruzi antibodies which are specifically bound to the 
polypeptide. 

19. A method as claimed in Claim 16, wherein the 
compound that enables detection of the anti-T. cruzi 
antibodies is selected from the group consisting of a 
colorimetric agent, a fluorescent agent, a 
chemiluminescent agent and a radionuclide. 

20. A kit for diagnosing the presence of anti-T. cruzi 
antibodies in a sample, comprising: 

a container in which a polypeptide having a sequence 
that corresponds to the amino acid sequence of at least 
one of the C-terminal and N-terminal nonrepetitive 
regions of the TCR27 protein is attached or conjugated to 
a carrier molecule or attached or conjugated to a solid 
phase; and 

directions for carrying out the method as claimed in 
Claim 16. 

21. A kit as claimed in Claim 18, additionally 
comprising a container of a compound that binds to 
anti-T. cruzi antibodies and that renders said antibodies 
detectable. 
FIG. 2A-1 



MSPILGYWKIKGLVQPTRLL 
ATGTCCCCTATACTAGGTTATTGGAAAATTAAGGGCCTTGTGCAACCCACTCGACTTCTT 

+ + + + + gQ 

LEYLEEKYEEHLYERDEGDK 

TTGGAATATCTTGAAGAAAAATATGAAGAGCATTTGTATGAGCGCGATGAAGGTGATAAA 
+ + + + 120 

WRNKKFELGLEFPNLPYYID 
TGGCGAAACAAAAAGTTTGAATTGGGTTTGGAGTTTCCCAATCTTCCTTATTATATTGAT 
+ + + + + 180 

GDVKLTQSMAIIRYIADKHN 

GGTGATGTTAAATTAACACAGTCTATGGCCATCATACGTTATATAGCTGACAAGCACAAC 
+ + + + + 240 

MLGGCPKERAEISMLEGAVL 
ATGTTGGGTGGTTGTCCAAAAGAGCGTGCAGAGATTTCAATGCTTGAAGGAGCGGTTTTG 

+ + + + ^. 

DIRYGVSRIAYSKDFETLKV 

GATATTAGATACGGTGTTTCGAGAATTGCATATAGTAAAGACTTTGAAACTCTCAAAGTT 
+ + + + + 

DFLSKLPEMLKMFEDRLCHK 

GATTTTCTTAGCAAGCTACCTGAAATGCTGAAAATGTTCGAAGATCGTTTATGTCATAAA 
+ + + + ^ 420 

TYLNGDHVTHPDFMLYDALD 

ACATATTTAAATGGTGATCATGTAACCCATCCTGACTTCATGTTGTATGACGCTCTTGAT 
+ ^ + + ^ 

VVLYMDPMCLDAFPKLVCFK 

GTTGTTTTATACATGGACCCAATGTGCCTGGATGCGTTCCCAAAATTAGTTTGTTTTAAA 
+ + + + + 540 

KRIEAIPQIDKYLKSSKYIA 
AAACGTATTGAAGCTATCCCACAAATTGATAAGTACTTGAAATCCAGCAAGTATATAGCA 

+ + + + + 600 

WPLQGWQATFGGGDHPPKSD 
TGGCCTTTGCAGGGCTGGCAAGCCACGTTTGGTGGTGGCGACCATCCTCCAAAATCGGAT 

+ + + + + ggQ 

LVPRGSPSQLQQAENNITNS 

CTGGTTCCGCGTGGATCCCCGTCCCAGCTCCAACAGGCAGAAAATAATATCACTAATTCC 
+ + + + + 720 
FIG. 2A-2 

KKEMTKLREKVKKAEKEKLD 
AAAAAAGAAATGACAAAGCTACGAGAAAAAGTGAAAAAGGCCGAGAAAGAAAAATTGGAC 
+ + + + + 780 

AINRATKLEEERNQAYKAAH 
GCCATTAACCGGGCAACCAAGCTGGAAGAGGAACGAAACCAAGCGTACAAAGCAGCACAC 
+ + + + + 840 

KAEEEKAKTFQRLITFESEN 
AAGGCAGAGGAGGAAAAGGCTAAAACATTTCAACGCCTTATAACATTTGAGTCGGAAAAT 
+ + + + + 900 

INLKKRPNDAVSNRDKKKNS 
ATTAACTTAAAGAAAAGGCCAAATGACGCAGTTTCAAATCGGGATAAGAAAAAAAATTCT 
+ + ^ + + 960 

ETAKTDEVEKQRAAEAAKAV 
GAAACCGCAAAAACTGACGAAGTAGAGAAACAGAGGGCGGCTGAGGCTGCCAAGGCCGTG 
+ + + + + 1020 

ETEKQRAAEATKVAEAEKRK 
GAGACGGAGAAGCAGAGGGCAGCTGAGGCCACGAAGGTTGCCGAAGCGGAGAAGCGGAAG 
+ + + + + 1080 

AAEAAKAVETEKQRAAEATK 

GCAGCTGAGGCCGCCAAGGCCGTGGAGACGGAGAAGCAGAGGGCAGCTGAAGCCACGAAG 
+ + + + + 1140 

VAEAEKQKAAEAAKAVETEK 
GTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAGGCCGCCAAGGCCGTGGAGACGGAGAAG 
+ + + + + 1200 

QRAAEATKVAEAEKQRAAEA 

CAGAGGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAGGGCAGCTGAAGCC 
+ + + + + 1260 

MKVAEAEKQKAAEATKVAEA 
ATGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAGGCCACGAAGGTTGCCGAAGCG 
+ + + + + 1320 

EKQKAAEATKVAEAEKQKAA 
GAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCT 
+ + + + + 1380 

EATKVAEAEKQKAAEATKVA 
GAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCC 
+ + + + + 1440 
FIG. 2A-3 



EAEKQKAAEATKVAEAEKQK 

GAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAG 
+ + + + + 1500 

AAEATKVAEAEKQKAAEATK 

GCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAG 
+ + + + + 1560 

VAEAEKQKAAEATKVAEAEK 
GTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAG 
+ + + + + 1620 

QKAAEATKVAEAEKQKAGEF 

CAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGGGGAATTC 
+ _ + + + + 1 680 

IVTD* 
ATCGTGACTGACTGA 
+-1695 
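
The central region shown in FIG. 2A-2 and FIG. 2A-3 repeats with a period of 14 residues, which is why successive 20-residue rows appear shifted by six positions. The following minimal sketch, which is not part of the original disclosure, counts back-to-back copies of a repeat unit in a stretch of one-letter sequence. The frame chosen for the unit (`EKQKAAEATKVAEA`) is an assumption made for illustration, since any cyclic shift of the 14 residues describes the same repeat, and `count_tandem_repeats` is a hypothetical helper name.

```python
def count_tandem_repeats(sequence, unit):
    """Return the longest run of back-to-back copies of `unit` in `sequence`."""
    best = 0
    start = sequence.find(unit)
    while start != -1:
        run = 0
        pos = start
        # Walk forward one unit at a time while copies keep matching.
        while sequence.startswith(unit, pos):
            run += 1
            pos += len(unit)
        best = max(best, run)
        start = sequence.find(unit, start + 1)
    return best


# Five consecutive one-letter rows from FIG. 2A-2 and FIG. 2A-3
# (DNA positions 1321 to 1620), joined into one 100-residue string.
segment = (
    "EKQKAAEATKVAEAEKQKAA"
    "EATKVAEAEKQKAAEATKVA"
    "EAEKQKAAEATKVAEAEKQK"
    "AAEATKVAEAEKQKAAEATK"
    "VAEAEKQKAAEATKVAEAEK"
)
unit = "EKQKAAEATKVAEA"  # assumed framing of the 14-residue repeat
print(count_tandem_repeats(segment, unit))
```

On this 100-residue stretch the helper finds seven complete tandem copies, with two residues of an eighth copy left over.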
FIG. 2B-1 



MSPILGYWKIKGLVQPTRLL 
ATGTCCCCTATACTAGGTTATTGGAAAATTAAGGGCCTTGTGCAACCCACTCGACTTCTT 

+ + + + + 60 

LEYLEEKYEEHLYERDEGDK 

TTGGAATATCTTGAAGAAAAATATGAAGAGCATTTGTATGAGCGCGATGAAGGTGATAAA 
+ + + + + 120 

WRNKKFELGLEFPNLPYYID 
TGGCGAAACAAAAAGTTTGAATTGGGTTTGGAGTTTCCCAATCTTCCTTATTATATTGAT 
+ + + + + 180 

GDVKLTQSMAIIRYIADKHN 

GGTGATGTTAAATTAACACAGTCTATGGCCATCATACGTTATATAGCTGACAAGCACAAC 
+ + + + + ---240 

MLGGCPKERAEISMLEGAVL 
ATGTTGGGTGGTTGTCCAAAAGAGCGTGCAGAGATTTCAATGCTTGAAGGAGCGGTTTTG 

+ + + + + 300 

DIRYGVSRIAYSKDFETLKV 
GATATTAGATACGGTGTTTCGAGAATTGCATATAGTAAAGACTTTGAAACTCTCAAAGTT 

+ + + + + 360 

DFLSKLPEMLKMFEDRLCHK 

GATTTTCTTAGCAAGCTACCTGAAATGCTGAAAATGTTCGAAGATCGTTTATGTCATAAA 
^ + + + ^ 

TYLNGDHVTHPDFMLYDALD 
ACATATTTAAATGGTGATCATGTAACCCATCCTGACTTCATGTTGTATGACGCTCTTGAT 
+ + + + + 480 

VVLYMDPMCLDAFPKLVCFK 

GTTGTTTTATACATGGACCCAATGTGCCTGGATGCGTTCCCAAAATTAGTTTGTTTTAAA 
+ + + + + 

KRIEAIPQIDKYLKSSKYIA 
AAACGTATTGAAGCTATCCCACAAATTGATAAGTACTTGAAATCCAGCAAGTATATAGCA 
+ + + + + 600 

WPLQGWQATFGGGDHPPKSD 
TGGCCTTTGCAGGGCTGGCAAGCCACGTTTGGTGGTGGCGACCATCCTCCAAAATCGGAT 
+ + + + + 660 

LVPRGSPSQLQQAENNITNS 

CTGGTTCCGCGTGGATCCCCGTCCCAGCTCCAACAGGCAGAAAATAATATCACTAATTCC 
+ + + + + 
FIG. 2B-2 



KKEMTKLREKVKKAEKEKLD 
AAAAAAGAAATGACAAAGCTACGAGAAAAAGTGAAAAAGGCCGAGAAAGAAAAATTGGAC 
+ + + + + 780 

AINRATKLEEERNQAYKAAH 
GCCATTAACCGGGCAACCAAGCTGGAAGAGGAACGAAACCAAGCGTACAAAGCAGCACAC 
+ + + + + 840 

KAEEEKAKTFQRLITFESEN 
AAGGCAGAGGAGGAAAAGGCTAAAACATTTCAACGCCTTATAACATTTGAGTCGGAAAAT 
+ + + + 900 

INLKKRPNDAVSNRDKKKNS 
ATTAACTTAAAGAAAAGGCCAAATGACGCAGTTTCAAATCGGGATAAGAAAAAAAATTCT 
+ + + + + 960 

ETAKTDEVEKQRAAEAAKAV 
GAAACCGCAAAAACTGACGAAGTAGAGAAACAGAGGGCGGCTGAGGCTGCCAAGGCCGTG 
+ + + + + 1020 

ETEKQRAGEFIVTD* 
GAGACGGAGAAGCAGAGGGCAGGGGAATTCATCGTGACTGACTGA 
+ + + +-1065 
FIG. 2C-1 



MSPILGYWKIKGLVQPTRLL 
ATGTCCCCTATACTAGGTTATTGGAAAATTAAGGGCCTTGTGCAACCCACTCGACTTCTT 

+ + + + + gQ 

LEYLEEKYEEHLYERDEGDK 
TTGGAATATCTTGAAGAAAAATATGAAGAGCATTTGTATGAGCGCGATGAAGGTGATAAA 
+ + + + + 120 

WRNKKFELGLEFPNLPYYID 
TGGCGAAACAAAAAGTTTGAATTGGGTTTGGAGTTTCCCAATCTTCCTTATTATATTGAT 

+ + + + + 180 

GDVKLTQSMAIIRYIADKHN 

GGTGATGTTAAATTAACACAGTCTATGGCCATCATACGTTATATAGCTGACAAGCACAAC 
+ + + + + 240 

MLGGCPKERAEISMLEGAVL 
ATGTTGGGTGGTTGTCCAAAAGAGCGTGCAGAGATTTCAATGCTTGAAGGAGCGGTTTTG 
+ + + + + 300 

DIRYGVSRIAYSKDFETLKV 

GATATTAGATACGGTGTTTCGAGAATTGCATATAGTAAAGACTTTGAAACTCTCAAAGTT 
+ + + + + 360 

DFLSKLPEMLKMFEDRLCHK 
GATTTTCTTAGCAAGCTACCTGAAATGCTGAAAATGTTCGAAGATCGTTTATGTCATAAA 
+ + + + ^ 420 

TYLNGDHVTHPDFMLYDALD 

ACATATTTAAATGGTGATCATGTAACCCATCCTGACTTCATGTTGTATGACGCTCTTGAT 
+ + + + + 4 go 

VVLYMDPMCLDAFPKLVCFK 
GTTGTTTTATACATGGACCCAATGTGCCTGGATGCGTTCCCAAAATTAGTTTGTTTTAAA 
+ + + + + 540 

KRIEAIPQIDKYLKSSKYIA 

AAACGTATTGAAGCTATCCCACAAATTGATAAGTACTTGAAATCCAGCAAGTATATAGCA 
+ + + + + 600 

WPLQGWQATFGGGDHPPKSD 

TGGCCTTTGCAGGGCTGGCAAGCCACGTTTGGTGGTGGCGACCATCCTCCAAAATCGGAT 
+ + + + + 660 

PPEAAKAMESQKQRFLERFA 
CCCCCTGAAGCTGCCAAGGCTATGGAGTCGCAGAAGCAGAGATTCTTAGAACGTTTTGCG 
+ + + + + 720 
FIG. 2C-2 



VLEEEKKAALRAAEMERRKI 
GTTCTTGAGGAGGAGAAAAAGGCAGCCTTAAGAGCGGCGGAGATGGAGAGGAGGAAAATA 
+ + + + + 780 

TNIMKNKGVRSSDSVPLVEG 
ACAAACATAATGAAGAATAAAGGTGTACGCAGTTCGGATTCGGTGCCGCTTGTGGAGGGG 
+ + + + + 840 

NRSVTESSCRNRFRFCRNRF 
AATCGCTCTGTTACTGAGAGTTCTTGTAGAAATCGGTTTCGTTTTTGTAGAAATCGGTTT 
+ + + + + 900 

RFSCSVM* 
CGTTTTTCATGTTCTGTAATGTGA 
^ + +-924 
FIG. 2D-1 

MSPILGYWKIKGLVQPTRLL 
ATGTCCCCTATACTAGGTTATTGGAAAATTAAGGGCCTTGTGCAACCCACTCGACTTCTT 



LEYLEEKYEEHLYERDEGDK 

TTGGAATATCTTGAAGAAAAATATGAAGAGCATTTGTATGAGCGCGATGAAGGTGATAAA 
+ + + ^ _ ^ 

WRNKKFELGLEFPNLPYYID 
TGGCGAAACAAAAAGTTTGAATTGGGTTTGGAGTTTCCCAATCTTCCTTATTATATTGAT 



GDVKLTQSMAIIRYIADKHN 

GGTGATGTTAAATTAACACAGTCTATGGCCATCATACGTTATATAGCTGACAAGCACAAC 
+ + + ^ ^ 240 

MLGGCPKERAEISMLEGAVL 

ATGTTGGGTGGTTGTCCAAAAGAGCGTGCAGAGATTTCAATGCTTGAAGGAGCGGTTTTG 
_ + + + ^ ^ 

DIRYGVSRIAYSKDFETLKV 

GATATTAGATACGGTGTTTCGAGAATTGCATATAGTAAAGACTTTGAAACTCTCAAAGTT 
+ + ^ ^ ^ 

DFLSKLPEMLKMFEDRLCHK 

GATTTTCTTAGCAAGCTACCTGAAATGCTGAAAATGTTCGAAGATCGTTTATGTCATAAA 
+ + + ^ ^ 

TYLNGDHVTHPDFMLYDALD 

ACATATTTAAATGGTGATCATGTAACCCATCCTGACTTCATGTTGTATGACGCTCTTGAT 
+ + + ^ ^ 

VVLYMDPMCLDAFPKLVC FK 

GTTGTTTTATACATGGACCCAATGTGCCTGGATGCGTTCCCAAAATTAGTTTGTTTTAAA 
+ + + + ^ 

KRIEAIPQIDKYLKSSKYIA 

AAACGTATTGAAGCTATCCCACAAATTGATAAGTACTTGAAATCCAGCAAGTATATAGCA 
+ + + + ^ 

WPLQGWQATFGGGDHPPKSD 

TGGCCTTTGCAGGGCTGGCAAGCCACGTTTGGTGGTGGCGACCATCCTCCAAAATCGGAT 
+ + + + ^ 

LVPRGSPSQLQQAENNI TNS 

CTGGTTCCGCGTGGATCCCCGTCCCAGCTCCAACAGGCAGAAAATAATATCACTAATTCC 
+ + + + ^ 720 
FIG. 2D-2 



KKEMTKLREKVKKAEKEKLD 

AAAAAAGAAATGACAAAGCTACGAGAAAAAGTGAAAAAGGCCGAGAAAGAAAAATTGGAC 
+ + + + + 780 

AINRATKLEEERNQAYKAAH 

GCCATTAACCGGGCAACCAAGCTGGAAGAGGAACGAAACCAAGCGTACAAAGCAGCACAC 
+ + + + + 840 

KAEEEKAKTFQRLITFESEN 

AAGGCAGAGGAGGAAAAGGCTAAAACATTTCAACGCCTTATAACATTTGAGTCGGAAAAT 

+ + + + + 900 

INLKKRPNDAVSNRDKKKNS 
ATTAACTTAAAGAAAAGGCCAAATGACGCAGTTTCAAATCGGGATAAGAAAAAAAATTCT 

+ + + + + 960 

ETAKTDEVEKQRAAEAAKAV 
GAAACCGCAAAAACTGACGAAGTAGAGAAACAGAGGGCGGCTGAGGCTGCCAAGGCCGTG 
+ + + + + 1020 

ETEKQRAAEATKVAEAEKRK 

GAGACGGAGAAGCAGAGGGCAGCTGAGGCCACGAAGGTTGCCGAAGCGGAGAAGCGGAAG 
+ + + + ^ 1Q8Q 

AAEAAKAVET EKQRAAEATK 

GCAGCTGAGGCCGCCAAGGCCGTGGAGACGGAGAAGCAGAGGGCAGCTGAAGCCACGAAG 
+ + + + + ^^t^Q 

VAEAEKQKAAEAAKAVETEK 
GTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAGGCCGCCAAGGCCGTGGAGACGGAGAAG 
+ + + + + 1200 

QRAAEATKVAEAEKQRAAEA 
CAGAGGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAGGGCAGCTGAAGCC 
+ + + + + 1260 

MKVAEAEKQKAAEATKVAEA 

ATGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAGGCCACGAAGGTTGCCGAAGCG 
+ + + + + 1320 

EKQKAAEATKVAEAEKQKAA 
GAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCT 
+ + + + + 1380 

EATKVAEAEKQKAAEATKVA 
GAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCC 
+ + + + + 1440
FIG. 2D-3

EAEKQKAAEATKVAEAEKQK

GAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAG
+ + + + + 1500



AAEATKVAEAEKQKAAEATK 
GCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAG
+ + + + + 1560

VAEAEKQKAAEATKVAEAEK 

GTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAG
+ + + + + 1620

QKAAEATKVAEAEKQKAAEA 
CAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAAGCT
+ + + + + 1680

AKAMESQKQRFLERFAVLEE 
GCCAAGGCTATGGAGTCGCAGAAGCAGAGATTCTTAGAACGTTTTGCGGTTCTTGAGGAG 

+ + + + + 1740

EKKAALRAAEMERRKITNIM
GAGAAAAAGGCAGCCTTAAGAGCGGCGGAGATGGAGAGGAGGAAAATAACAAACATAATG 
+ + + + + 1800 

KNKGVRSSDSVPLVEGNRSV 
AAGAATAAAGGTGTACGCAGTTCGGATTCGGTGCCGCTTGTGGAGGGGAATCGCTCTGTT 

+ + + + + 1860

TESSCRNRFRFCRNRFRFSC 

ACTGAGAGTTCTTGTAGAAATCGGTTTCGTTTTTGTAGAAATCGGTTTCGTTTTTCATGT
+ + + + + 1920

S V M * 
TCTGTAATGTGA
1932 
FIG.

MSPILGYWKIKGLVQPTRLL 
ATGTCCCCTATACTAGGTTATTGGAAAATTAAGGGCCTTGTGCAACCCACTCGACTTCTT 
+ + + + + 60 

LEYLEEKYEEHLYERDEGDK 
TTGGAATATCTTGAAGAAAAATATGAAGAGCATTTGTATGAGCGCGATGAAGGTGATAAA 
+ + + + + 120

WRNKKFELGLEFPNLPYYID
TGGCGAAACAAAAAGTTTGAATTGGGTTTGGAGTTTCCCAATCTTCCTTATTATATTGAT 
+ + + + + 180 

GDVKLTQSMAIIRYIADKHN 
GGTGATGTTAAATTAACACAGTCTATGGCCATCATACGTTATATAGCTGACAAGCACAAC
+ + + + + 240 

MLGGCPKERAEISMLEGAVL
ATGTTGGGTGGTTGTCCAAAAGAGCGTGCAGAGATTTCAATGCTTGAAGGAGCGGTTTTG 
+ + + + + 300 

DIRYGVSRIAYSKDFETLKV 
GATATTAGATACGGTGTTTCGAGAATTGCATATAGTAAAGACTTTGAAACTCTCAAAGTT
+ + + + + 360 

DFLSKLPEMLKMFEDRLCHK
GATTTTCTTAGCAAGCTACCTGAAATGCTGAAAATGTTCGAAGATCGTTTATGTCATAAA 
+ + + + + 420 

TYLNGDHVTHPDFMLYDALD 
ACATATTTAAATGGTGATCATGTAACCCATCCTGACTTCATGTTGTATGACGCTCTTGAT 
+ + + + + 480

VVLYMDPMCLDAFPKLVCFK
GTTGTTTTATACATGGACCCAATGTGCCTGGATGCGTTCCCAAAATTAGTTTGTTTTAAA
+ + + + + 540

KRIEAIPQIDKYLKSSKYIA
AAACGTATTGAAGCTATCCCACAAATTGATAAGTACTTGAAATCCAGCAAGTATATAGCA 
+ + + + + 600 

WPLQGWQATFGGGDHPPKSD 
TGGCCTTTGCAGGGCTGGCAAGCCACGTTTGGTGGTGGCGACCATCCTCCAAAATCGGAT 
+ + + + + 660 

LIEGRGIPPGCRNSTKVAEA
CTGATCGAAGGTCGTGGGATCCCCCCGGGCTGCAGGAATTCCACGAAGGTTGCCGAAGCG 
+ + + + + 720 
FIG. 2E-2



EKQKAAEATKVAEAEKQRAA 
GAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAGGGCAGCT
+ + + + + 780



EATKVAEAEKQKAAEATKVA 

GAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCC
+ + + + + 840



EAEKQRAAEATKVAEAEKQK 

GAAGCGGAGAAGCAGAGGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAAAAG
+ + + + + 900

AAEATKVAGDEKQKAAEATK 

GCAGCTGAGGCCACGAAGGTTGCCGGAGACGAGAAGCAGAAGGCAGCTGAAGCCACGAAG
+ + + + + 960

VAEAEKQKAAEATKVAEAEK 
GTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAG
+ + + + + 1020 

QKAAEATKVAEAEKQKAAEA 
CAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAAGCC 
+ + + + + 1080 

TKVAEAEKQKAAEATKVAEA 

ACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCG 
+ + + + + 1140 

EKQKAAEATKVAEAEKQRAA 
GAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCT
+ + + + + 1200 

EATKVAEAEKQKAAEATKVA 
GAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCC 
+ + + + + 1260

EAEKQKAAEATKVAEAEKQK 
GAAGCGGAGAAGCAGAAGGCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAG 
+ + + + + 1320 

AAEATKVAEAEKQKVGEADQ 
GCAGCTGAAGCCACGAAGGTTGCCGAAGCGGAGAAGCAGAAGGTAGGTGAGGCTGATCAA
+ + + + + 1380 

AYRYRREFIVTD* 
GCTTATCGATACCGTCGGGAATTCATCGTGACTGACTGA 
+ + + 1419 
FIG. 3C: T. cruzi-infected patients and uninfected controls vs. Ag4
FIG. 3D: T. cruzi-infected patients and uninfected controls vs. Ag44
FIG. 3E: T. cruzi-infected patients and uninfected controls vs. Ag8
FIG. 3F: T. cruzi-infected patients and uninfected controls vs. GST
[Plots not reproduced. Each panel shows absorbance (y-axis) for group 1, infected, and group 2, uninfected (x-axis).]



INTERNATIONAL SEARCH REPORT

International Application No.: PCT/US 95/03191

A. CLASSIFICATION OF SUBJECT MATTER
IPC 6: [leading codes illegible in source] C07K14/44 G01N33/577

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED
Minimum documentation searched (classification system followed by classification symbols)
IPC 6: C12N C07K G01N

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category: X
Citation: MOL. BIOCHEM. PARASITOL. (1993), 57(2), 317-30, CODEN: MBIPDP; ISSN: 0166-6851, 1993. OTSU, KEIKO ET AL, 'Interruption of a Trypanosoma cruzi gene encoding a protein containing 14-amino acid repeats by targeted insertion of the neomycin phosphotransferase gene'. Cited in the application. The whole document.
Relevant to claim No.: 1, 2, 7, 8, 10, 11, 13, 14

[X] Further documents are listed in the continuation of box C.
[X] Patent family members are listed in annex.

Date of the actual completion of the international search: 15 June 1995
Date of mailing of the international search report: 29.06.95

Name and mailing address of the ISA:
European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl
Fax (+31-70) 340-3016

Authorized officer: Hornig, H



C. (Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Citation: INFECT. IMMUN. (1989), 57(7), 1959-67, CODEN: INFIBR; ISSN: 0019-9567, 1989. HOFT, DANIEL F. ET AL, 'Trypanosoma cruzi expresses diverse repetitive protein antigens'. Cited in the application. The whole document.
Relevant to claim No.: 1-21

Citation: GENE, vol. 67, no. 1, 1988, ELSEVIER SCIENCE PUBLISHERS B.V., AMSTERDAM, NL, pages 31-40. D.B. SMITH AND K.S. JOHNSON, 'Single-step purification of polypeptides expressed in Escherichia coli as fusions with glutathione S-transferase'. Cited in the application. The whole document.
Relevant to claim No.: 1-21

Citation: WO,A,94 01776 (ABBOTT LAB) 20 January 1994. The whole document.
Relevant to claim No.: 1-21

Citation: WO,A,93 16199 (REED STEVEN G) 19 August 1993. The whole document.
Relevant to claim No.: 1-21



Information on patent family members

Patent document cited     Publication    Patent family     Publication
in search report          date           member(s)         date

WO-A-9401776              20-01-94       AU-B-4670193      31-01-94
                                         EP-A-0649536      26-04-95

WO-A-9316199              19-08-93       US-A-5304371      19-04-94
                                         CA-A-2129747      15-08-93
                                         EP-A-0649475      26-04-95
                                         US-A-5413912      09-05-95